Impact of instrument error on the estimated prevalence of overweight and obesity in population-based surveys
- Anna Biehl1, 2Email author,
- Ragnhild Hovengen†1,
- Haakon E Meyer1, 3,
- Jøran Hjelmesæth†2,
- Jørgen Meisfjord†1,
- Else-Karin Grøholt†1,
- Mathieu Roelants†4, 5 and
- Bjørn Heine Strand†1
© Biehl et al.; licensee BioMed Central Ltd. 2013
Received: 25 September 2012
Accepted: 13 February 2013
Published: 18 February 2013
The basis for this study is the fact that instrument error increases the variance of the distribution of body mass index (BMI). Combined with a defined cut-off value this may impact upon the estimated proportion of overweight and obesity. It is important to ensure high quality surveillance data in order to follow trends of estimated prevalence of overweight and obesity. The purpose of the study was to assess the impact of instrument error, due to uncalibrated scales and stadiometers, on prevalence estimates of overweight and obesity.
Anthropometric measurements from a nationally representative sample were used; the Norwegian Child Growth study (NCG) of 3474 children. Each of the 127 participating schools received a reference weight and a reference length to determine the correction value. Correction value corresponds to instrument error and is the difference between the true value and the measured, uncorrected weight and height at local scales and stadiometers. Simulations were used to determine the expected implications of instrument errors. To systematically investigate this, the coefficient of variation (CV) of instrument error was used in the simulations and was increased successively.
Simulations showed that the estimated prevalence of overweight and obesity increased systematically with the size of instrument error when the mean instrument error was zero. The estimated prevalence was 16.4% with no instrument error and was, on average, overestimated by 0.5 percentage points based on observed variance of instrument error from the NCG-study. Further, the estimated prevalence was 16.7% with 1% CV of instrument error, and increased to 17.8%, 19.5% and 21.6% with 2%, 3% and 4% CV of instrument error, respectively.
Failure to calibrate measuring instruments is likely to lead to overestimation of the prevalence of overweight and obesity in population-based surveys.
KeywordsInstrument error Calibration Anthropometry Weights and measures Obesity Overweight Epidemiology
Measurement error and instrument error
The terminology used to describe anthropometric measurement error is not consistent [4, 13]. Four frequently used concepts are precision, reliability, accuracy and validity . The major sources of error associated with weight and height measurements of children are the observer, the child being measured (hydration and bladder contents, clothing, etc.), and the instrument used . The precision and reliability of measurements refers to the extent to which repeated measurements give the same value, whereas accuracy and validity refer to how close a measurement is to its “true” value and is related both to the measurement technique as well as the measuring instrument [4, 15]. Since a true value cannot be determined, in practice a conventional “true” value is used. The accuracy of a measuring instrument is the “ability to give responses close to a true value” [16:p.41]. In the present paper we focus on the accuracy of measuring instruments, which, in cases of inaccuracy will be referred to as instrument error. Components of measurement error associated with measurement technique will therefore be ignored. There are many reasons for the inaccuracy of measuring instruments, including overuse without maintenance or recalibration, incorrect usage and general wear and tear as a result of frequent transportation.
Calibration vs. correction values
Measuring instruments should be calibrated according to a standard and adjusted in case of deviation . Since population-based surveys often involve a large number of measurement sites, standardised calibration of all measuring instruments is an expensive and complicated procedure. As an alternative, the NCG-study developed a method where a correction value was determined for each instrument using a reference weight and a reference height. The correction value is equivalent to the instrument error and corresponds to the difference between the “true” value and the uncorrected value measured by an uncalibrated instrument. The term reference was used, since the method does not satisfy the requirements of a standard within metrology [16:p.46]. The procedure is described in detail below.
The main aim of the present study was to assess the degree to which instrument error, due to uncalibrated scales and stadiometers, impacts upon the overall estimated prevalence of overweight and obesity in population-based surveys. Growth data from a nationally representative sample and correction value data for all the instruments used in the study were utilised to illustrate the implications.
To illustrate how instrument error affects the prevalence of overweight, we generated datasets from real data on weight and height from a sample of 8 year old children, to which simulated data with various degrees of instrument error were added. For each simulation a new prevalence estimate was produced, and after several simulations the mean and variance of the prevalence estimate could be calculated. Finally, to get an idea of how the size of the instrument error affects the prevalence estimate we successively increased the instrument error and re-ran the simulations.
Anthropometric measurements were obtained in 2008 from a nationally representative sample of 3474 third grade pupils (~8 year olds) recruited in 127 primary schools as a part of the Norwegian Child Growth Study (NCG). Participating schools were selected using a stratified two-stage sampling design following a protocol that was jointly developed by the WHO Regional Office for Europe and the participating Member States of the World Health Organization European Childhood Obesity Surveillance Initiative.
The NCG-study was evaluated by the Regional Committee for Medical Research Ethics and approved by the Norwegian Data Inspectorate. Parents and guardians were informed about the study by letter beforehand and written informed consent was obtained from a parent or legal guardian via the school nurse prior to the study.
Prior to data collection all school nurses involved in the study received training in the standardised procedures of anthropometric measurement and the collection of correction values. Methods were explained and illustrated in a booklet specially developed for the NCG-study.
The measuring instruments used in this study were those already available on site, in that the types probably differed from one school to another. However, the usage and positioning of these instruments was standardised, e.g. scales had to be placed on a hard, horizontal floor.
Body weight and height were measured according to standard procedures  and recorded to the nearest 0.1 kg and 0.1 cm respectively. Body Mass Index (BMI) was calculated as weight/height2 (kg/m2). Children were categorised as either under and normal weight (BMI < 25 kg/m2) and overweight or obese (BMI ≥ 25 kg/m2) using one cut-off value .
Correction values were collected in 2008 at the same time as anthropometric data collection. Each of the 127 schools received a reference weight and length that provided data from which to calculate the correction value for instruments at each school. The reference weights and lengths were within the range of the sample’s anthropometric measurements, i.e. a weight close to 28 kg for the scales and a length of 120 cm for the stadiometers.
The anthropometric measurements of the children were corrected post-hoc. The corrected measurements thus correspond to measurements taken by calibrated instruments and are assumed to be free of instrument error, although the instruments themselves were not calibrated.
A 25-l plastic container, filled with cold water, was used as a reference weight. Initially the containers were brought to the Norwegian Metrology Service and each container was numbered (1–127), filled with cold tap water and weighed on a calibrated scale. The reference weight of each container – the “true” weight - was registered in a protocol. The containers were then distributed to the participating schools and the school nurses were instructed to fill up the container with cold tap water until the last drop trickled over, cap it, weigh it once according to the standardised method and register the weight. Finally, we calculated the correction value (instrument error) for all 127 scales to be the difference between the “true” weight of the container and the uncorrected weight of the corresponding container measured at the school scale by the school nurse.
In cases where the measurements of the children were carried out over two days, the procedure of weighing the plastic container was not required to be repeated as long the scales were not moved.
Data analyses and simulation
To investigate the impact of instrument error on the estimated prevalence of overweight including obesity (BMI ≥25 kg/m2) we simulated random samples of corrected height- and weight data derived from the NCG-study and correction values (instrument error) were added in each simulation (n=1000). The coefficient of variation (CV) is an expression for the magnitude of instrument error and is useful to compare the variability in samples involving different measurements (weight in kilograms and height in metres).
An overview of the two models presenting the coefficient of variation (CV%), mean and SD of instrument error and the corresponding estimated prevalence of overweight and obesity (BMI ≥ 25 kg/m 2 ) as mean, SD and the range (minimum-maximum values)
Estimated prevalence of overweight and obesity (%)
0 (16.4 – 16.4)
0.17 (15.9 – 17.1)
0.24 (16.0 – 17.5)
0.28 (16.3 – 18.0)
0.35 (16.7 – 19.2)
0.39 (17.2 – 19.6)
0.41 (18.1 – 20.7)
0.46 (19.1 – 22.1)
0.51 (19.5 – 23.2)
0.25 (15.8 – 17.7)
All simulations were programmed in STATA 11.
On average it was observed that the instruments in the NCG-study underestimated weight and height measurements. The correction values of scales ranged from −3.0 to +1.7 kg, mean (SD): –0.14 (0.64) and from −3.5 to +4.0 cm, mean (SD): –0.07 (1.11) for stadiometers. In the NCG-study the CV of instrument error of the scales and stadiometers was 2.1% and 0.9%, respectively. This indicates that the CV of instrument error was at least twice as large for scales compared to stadiometers.
The prevalence was underestimated in a minority of the simulated samples. When the CV of instrument error was 1%, about 1 in 12 of the simulated samples underestimated the true prevalence. When the CV was 1.5% this happened only 1 in 200 of the simulated samples, and when the CV was 2% or greater no simulated samples underestimated the true prevalence.
In the second model, using the actual size of the CV of instrument error of the NCG-study, uncorrected estimates showed an average overestimation corresponding to 0.5 percentage points of the prevalence of overweight and obesity among 8 year-olds (Table 1).
In this study we found that instrument error due to uncalibrated scales and stadiometers, combined with cut-off based classification systems, can lead to minor but systematic overestimation of the prevalence of overweight and obesity in a nationally representative sample.
To the best of our knowledge, this is the first study to show how increased variance of anthropometric measurements can affect outcome measures in obesity surveillance. Previous studies have considered the accuracy of anthropometric measurements, but did not evaluate the impact on the estimated prevalence of overweight including obesity (BMI ≥25 kg/m2) . It has been reported that scales in healthcare settings did not have higher accuracy than scales in fitness- or weight loss centres . Furthermore, beam-balance scales are more accurate than scales with electronic mechanisms, whilst bathroom-type scales with a spring mechanism are least accurate . These findings indicate that it is possible to increase accuracy by ensuring the school health service has good quality instruments.
Our findings are particularly relevant for population-based studies and surveillance programs that maximise the use of existing resources, such as instruments in the school health service. It is important to consider and balance accuracy with feasibility . We thus developed a simple yet effective procedure for the NCG-study in order to obtain correction values for instruments at each school. This need was clearly demonstrated by the wide range of instrument error (4.7 kg and 7.5 cm) observed in this study from 127 scales and stadiometers. Valid data were collected with limited costs. We also found that, on average, instruments in the NCG-study slightly underestimated both weight (mean: –0.14 kg) and height (mean: –0.07 cm) measurements. However, this information was not used in the current analyses since the aim of this study was solely to assess the effect of instrument error variation. The average of the error terms was therefore set to zero in the simulations. If the mean error term was not set to zero, but rather set to the values observed in the NCG-study, the entire distribution would have shifted to the left due to the negative means, whilst the effect of “heavy tails” demonstrated in this study would have been equalised or toned down.
A possible limitation of the approach adopted in the NCG-study, i.e. adjusting for the correction value, is that it may only be valid for the specific point on the measuring-scale that corresponds to the reference weight and length value. To ensure the collection of the correction value on the appropriate part of the measuring-scale in weighing scales, a reference weight was chosen within the range of value of our target population. For stadiometers corrected at one point, measurements are likely to be correct along the whole measuring-scale.
According to the procedures, the plastic containers used to collect correction values were only measured once. Duplicate measurements would have increased the reliability, but optimisation and feasibility must be considered. Overly complicated procedures would undoubtedly increase the risk of dropout. In the present study, no schools dropped out. At the majority of schools, the data collection was completed within one day.
Only a single average cut-off value for overweight and obesity was used for the entire sample of 8–9 year old girls and boys in the simulations, and not age- and sex specific cut-off values as recommended by the IOTF . This was done deliberately, in order to simplify the analyses. The expressed aim of the study was to explore the phenomenon of increased variance rather than to present correct estimates of the prevalence of overweight including obesity. For the same reason, the two-stage sampling methodology was not taken into account in the simulation analysis.
The rationale behind running the simulations according to two different models was to illustrate, in the first model, that an increasing instrument error will systematically impact upon the estimated proportion of overweight and obesity in surveys. The second model contained the actual CVs of instrument error for scales and stadiometers derived from the NCG-study and serves as a realistic example.
In the literature it is often stated that instruments are calibrated prior to data collection, without giving detailed calibration procedures. It has been claimed that once instruments are installed and calibrated, error due to the instruments is negligible . Our findings suggest the contrary. Indeed, the impetus for this paper stems from the finding that old flooring had been removed from a school health office without adjustments to the wall-mounted stadiometer and with subsequent inaccurate height measurements. It shows that even when measuring instruments are calibrated upon acquisition and installment they can become uncalibrated, underlining the need for procedures for regular maintenance of anthropometric instruments. Generally, measuring instruments are bought calibrated, but due to lack of awareness, they may never be maintained or recalibrated after years of use.
To the best of our knowledge, this is the first study to demonstrate that instrument error will systematically increase the estimated prevalence of overweight and obesity. This study emphasises the need for regular maintenance or recalibration of instruments in order to reduce instrument error and to obtain more accurate estimates in populations-based surveys. Although the present analysis is limited to the use of inaccurate height and weight measuring instruments, the outlined principle is generic and can be applied to measurement error in general.
This study is a collaboration between the Norwegian Institute of Public Health and the Morbid Obesity Center at Vestfold Hospital Trust, and was funded by the South-Eastern Norway Regional Health Authority. The funders had no role in the study design or the interpretation of the data or the decision to submit the article for publication. The work has been presented at the European Childhood Obesity Group (ECOG) meeting in Dublin in September 2009. We would like to thank Kåre Bævre for his invaluable comments on analysis methods at an early phase of the study. Thanks are also due to Matthew McGee for proofreading the final manuscript.
- Katzmarzyk PT, Baur LA, Blair SN, Lambert EV, Oppert JM, Riddoch C: International conference on physical activity and obesity in children: summary statement and recommendations. Int J Pediatr Obes. 2008, 3: 3-21.View ArticlePubMedGoogle Scholar
- Barlow SE, Expert Committee: Expert committee recommendations regarding the prevention, assessment, and treatment of child and adolescent overweight and obesity: summary report. Pediatrics. 2007, 120 (Suppl 4): 164-192.View ArticleGoogle Scholar
- de Onis M: The use of anthropometry in the prevention of childhood overweight and obesity. Int J Obes. 2004, 28 (Suppl 3): 81-85.View ArticleGoogle Scholar
- Ulijaszek SJ, Kerr DA: Anthropometric measurement error and the assessment of nutritional status. Br J Nutr. 1999, 82: 165-77. 10.1017/S0007114599001348.View ArticlePubMedGoogle Scholar
- Himes JH: Challenges of accurately measuring and using BMI and other indicators of obesity in children. Pediatrics. 2009, 124 (Suppl 1): 3-22.View ArticleGoogle Scholar
- Townsend N, Rutter H, Foster C: Variations in data collection can influence outcome measures of BMI measuring programmes. Int J Pediatr Obes. 2011, 6: 491-498. 10.3109/17477166.2011.605897.View ArticlePubMedGoogle Scholar
- Schlegel-Pratt K, Heizer WD: The accuracy of scales used to weigh patients. Nutr Clin Pract. 1990, 5: 254-257. 10.1177/0115426590005006254.View ArticlePubMedGoogle Scholar
- Oberlander J, Saunders RC, Knox L, Crosby L, Mullen J: Survey of weight scale variability. J Parenter Enteral Nutr. 1981, 4: 606-Google Scholar
- Cole TJ, Flegal KM, Nicholls D, Jackson AA: Body mass index cut offs to define thinness in children and adolescents: international survey. BMJ. 2007, 335: 194-10.1136/bmj.39238.399444.55.View ArticlePubMedPubMed CentralGoogle Scholar
- Cole TJ, Bellizzini MC, Flegal KM, Dietz WH: Establishing a standard definition for child overweight and obesity worldwide: international survey. BMJ. 2000, 320: 1240-1243. 10.1136/bmj.320.7244.1240.View ArticlePubMedPubMed CentralGoogle Scholar
- Harris HE, Ellison GT, Holliday M, Nickson C: How accurate are antenatal weight measurements? A survey of hospital and community clinics in a South Thames Region NHS Trust. Paediatr Perinat Epidemiol. 1998, 12: 163-175. 10.1046/j.1365-3016.1998.00100.x.View ArticlePubMedGoogle Scholar
- Larsen RJ, Marx ML: An introduction to mathematical statistics and its applications. 1986, Englewood Cliffs, N.J.: Prentice-HallGoogle Scholar
- Cameron N: The method of Auxological Anthropometry. Human Growth a comprehensive treatise. Edited by: Falkner F, Tanner JM. 1986, New York: Plenum Press, 3-46.Google Scholar
- Pederson D, Gore C: Anthropometry measurement error. Anthropometrica: a textbook of body measurement for sports and health courses. Edited by: Norton K, Olds T. 1996, Sydney: University of New South Wales Press Ltd, 77-96.Google Scholar
- Cameron N: Human growth and development. 2002, San Diego, Calif: AcademicGoogle Scholar
- International vocabulary of basic and general terms in metrology. 1993, Geneva: International Organization for Standardization
- Physical status: the use and interpretation of anthropometry. Report of a WHO Expert Committee. WHO Technical Report Series 854. 1995, Geneva: World Health Organization
- European Union: Directive. 2004, http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2004:135:0001:0080:EN:PDF, /22/EC of the European Parliament and of the council of 31 March 2004 on measuring instruments. EU,Google Scholar
- Stein RJ, Haddock CK, Poston WS, Catanese D, Spertus JA: Precision in weighing: a comparison of scales found in physician offices, fitness centers, and weight loss centers. Public Health Rep. 2005, 120: 266-270.PubMedPubMed CentralGoogle Scholar
- Welk GJ, Corbin CB, Dale D: Measurement issues in the assessment of physical activity in children. Res Q Exerc Sport. 2000, 71 (Suppl 2): 59-73.View ArticlePubMedGoogle Scholar
- The pre-publication history for this paper can be accessed here:http://www.biomedcentral.com/1471-2458/13/146/prepub