Large-scale trials are needed to assess the potential impact of increased PA on reducing pregnancy complications, and these trials should ideally use objective methods to measure changes in PA [1, 3, 37]. Pedometers would be a cost-effective measurement tool for large studies, provided that the simple step count measure is broadly comparable to the more specific accelerometer data. Our results show that although there was a significant correlation between pedometer step counts and most accelerometer measures of PA and no difference in median step counts between the two devices, the 95% limits of agreement were very broad, especially among those participants who were less active. Additionally, the direction of the difference between the monitors appeared to reverse across the range of activity levels, suggesting a complicated pattern of disagreement. Agreement between the monitors in categorising participants as active or inactive varied from fair to good depending on the criteria adopted, being good when achievement of ≥8,000 steps/day was used as the criterion.
Pedometer step counts have been compared to accelerometer data in a number of previous reports [18, 20–23], all of which used one of the Actigraph/CSA/Manufacturing Technology Inc. (MTI) accelerometer models and one of the Yamax pedometer models, as in the present study. These accelerometer models are not entirely comparable to each other in measuring steps and activity counts [38, 39]. The study by Harrison et al. included 30 overweight or obese pregnant women at 26 to 28 weeks' gestation in Australia. The participants wore an accelerometer (GT1M Actigraph) and a pedometer for 5 to 7 days, and the accelerometer data processing rules were very similar to those used in our study. Despite a statistically significant correlation (r = 0.69, p < 0.01) between the step counts of the two monitors, the mean difference was 505 steps/day and the limits of agreement were wide (from -2491 to 3501 steps/day).
The other studies were not conducted in pregnant women and generally included subjects who were more active than our participants. However, the findings were essentially similar to those in the present study. Tudor-Locke et al., in a study of 60 adult volunteers in South Carolina, USA, observed a high correlation between accelerometer (CSA model 7164) and pedometer step counts (r = 0.86), but the accelerometer detected 1845 ± 2116 more steps/day on average than the pedometer and the limits of agreement were even broader (-2387 to 6077 steps/day) than reported in the present study. In a larger study of older adults in the UK (n = 121), Harris et al. reported that pedometer step counts were highly correlated with the accelerometer (Actigraph GT1M) step counts (r = 0.86) and that the mean step counts were similar. However, the limits of agreement were again wide, around -3500 to 3500. Ramirez-Marrero et al. reported similar findings among 58 adults with HIV in Puerto Rico. Although the limits of agreement were not calculated in that study, individual variation in differences in step counts seemed to be large. When comparing pedometer step counts to other accelerometer (Actigraph model 7164) measures of PA, the correlations observed in the present study are generally similar, although weaker than in the previous studies [20, 22, 23]. Macfarlane et al. observed among 57 adult volunteers in Hong Kong that the means for accelerometer (MTI model 7164) measures of PA increased with increasing pedometer step counts, but the confidence intervals were broad.
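For readers less familiar with limits of agreement, the calculation behind the figures quoted above can be sketched as follows. This is a minimal illustration and not the analysis code of any of the cited studies: the function names and the paired step counts are hypothetical, and the check against the Tudor-Locke et al. figures assumes that the reported ± 2116 is the standard deviation of the differences and that a multiplier of 2 (rather than 1.96) was used.

```python
from statistics import mean, stdev


def limits_of_agreement(a, b, k=1.96):
    """Return (mean difference, lower limit, upper limit) for paired data.

    Differences are taken as a - b; k is the SD multiplier (1.96 or 2).
    """
    diffs = [x - y for x, y in zip(a, b)]
    d = mean(diffs)
    s = stdev(diffs)  # sample SD of the paired differences
    return d, d - k * s, d + k * s


def limits_from_summary(mean_diff, sd_diff, k=1.96):
    """Limits of agreement from a reported mean difference and SD."""
    return mean_diff - k * sd_diff, mean_diff + k * sd_diff


# Hypothetical paired daily step counts (accelerometer, pedometer).
accel = [5200, 7400, 6100, 9300, 4800]
pedo = [4300, 7900, 5600, 9800, 3500]
mean_diff, lower, upper = limits_of_agreement(accel, pedo)

# Summary values quoted in the text (assumed: 2116 is the SD, k = 2).
tl_lower, tl_upper = limits_from_summary(1845, 2116, k=2)
print(tl_lower, tl_upper)  # -2387 6077
```

Under those assumptions the summary values reproduce the quoted limits of -2387 to 6077 steps/day, which illustrates that the width of the limits is driven by the standard deviation of the individual differences rather than by the mean difference.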
Amongst the previous studies, Tudor-Locke et al. were the only authors to conclude that agreement between pedometer step counts and accelerometer measures of PA was unacceptably low, despite others also reporting broad limits of agreement [18, 23] or confidence intervals. Future studies should pay more attention to correct interpretation of Bland-Altman plots and limits of agreement.
The present study confirms the findings of these studies in a sample of 58 overweight and obese pregnant women. Although there are currently no methods available to calculate 95% confidence intervals for limits of agreement determined by a regression-based method, it is important to note that a larger sample size would not have affected the width of the limits of agreement. There are also no guidelines for acceptable 95% limits of agreement for step counts. We propose that they should be no larger than ±500 steps/day (i.e. a range of 1000 steps/day), which is likely to correspond to a maximum of 10 min difference in the duration of MVPA, such as brisk walking, and may therefore be of clinical and public health importance.
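The arithmetic linking the proposed limits to minutes of MVPA can be sketched as below. The cadence of 100 steps/min for brisk walking is an assumption introduced here for illustration (a commonly used heuristic); the constant and function names are hypothetical.

```python
# Assumed cadence for brisk walking, in steps per minute (illustrative heuristic).
ASSUMED_BRISK_CADENCE = 100


def steps_to_minutes(steps, cadence=ASSUMED_BRISK_CADENCE):
    """Convert a step count to minutes of walking at the given cadence."""
    return steps / cadence


# Proposed limits of +/- 500 steps/day span a range of 1000 steps/day.
loa_range = 2 * 500
minutes = steps_to_minutes(loa_range)
print(minutes)  # 10.0
```

A slower assumed cadence would translate the same range of steps into more minutes, so the 10 min figure should be read as approximate.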
The difference between the accelerometer and the pedometer step counts was correlated with the mean of both measures in the present study, but not in the others [18, 20, 23]. In this study, the difference between the step counts was in the opposite direction for less active and more active women, i.e. the accelerometer detected more steps among less active women and the pedometer detected more steps among more active women. This discrepancy may be related to the general limitations of the monitors or to differences in their sensitivity to detect PA. Pedometer accuracy is reported to be diminished at slow walking speeds, especially below 3 miles/h [31, 40], and both active and inactive participants undertook many episodes of light intensity activity in the present study. This may also be the case with some accelerometers, although the GT1M model used in our study has been shown to have lower intermonitor variability and lower sensitivity for light intensity activity than the preceding 7164 model [38, 39]. On the other hand, the previous CSA model has also been reported to erroneously detect slightly more non-steps, e.g. when travelling by motor vehicle.
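A reversal in the direction of the difference of this kind can be made concrete by regressing the difference on the mean of the two monitors and locating the activity level at which the fitted difference crosses zero. The sketch below uses ordinary least squares on hypothetical, perfectly linear data; it is an illustration of the idea, not the regression-based method used in the present analysis, and all names and values are assumptions.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


# Hypothetical data: mean of the two monitors (steps/day) and
# difference (accelerometer minus pedometer, steps/day).
means = [3000, 5000, 7000, 9000, 11000]
diffs = [1200, 600, 0, -600, -1200]

intercept, slope = fit_line(means, diffs)

# Activity level at which the fitted difference changes sign.
crossover = -intercept / slope
print(round(slope, 3), round(intercept), round(crossover))
```

In this constructed example the accelerometer reads higher below the crossover and the pedometer reads higher above it, mirroring the pattern described in the text.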
The accuracy of the latest Actigraph accelerometer model (GT3X) and Yamax Digiwalker SW-200 was recently investigated in 30 pregnant women. Both monitors underestimated the number of steps, especially at slow walking speeds, but positioning the monitors at a tilt angle did not correlate with the percentage of actual steps detected by either monitor. In contrast, Crouter et al. suggested that the tilt angle reduced the accuracy of spring-levered pedometers in overweight or obese adults. The tilt angle may also reduce the accuracy of accelerometers in assessing vertical movement, which may happen more often among overweight and obese than normal weight people. The tilt angle was not directly measured in the present study. However, BMI and gestational age did not significantly modify the results of the Bland-Altman plot, suggesting that the potential effect of the tilt angle on the results may have been the same regardless of BMI or gestational age.
We also assessed agreement between the monitors in categorising participants as active or inactive. Whilst agreement was relatively good (Kappa 0.63) when using 8,000 steps/day as the criterion for both monitors, agreement was lower (Kappa 0.45) when comparing participants achieving 8,000 pedometer steps/day to those achieving 30 min of MVPA/day measured by accelerometer. Of the previous studies, Ramirez-Marrero et al. reported fair agreement between 10,000 pedometer steps/day and 150 min MVPA/week (Kappa 0.25, p = 0.01). These discrepancies between pedometer and accelerometer in categorising participants as active or inactive may partly be due to the data processing rules, such as selection of the epoch length and intensity cut points to define MVPA for the accelerometer data.
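The Kappa statistic used for these comparisons corrects the observed proportion of agreement for the agreement expected by chance. A minimal sketch for a 2 × 2 classification (active or inactive by each monitor) is given below; the function name and the cell counts are hypothetical and are not taken from the present data.

```python
def cohens_kappa(both_active, a_only, b_only, both_inactive):
    """Cohen's kappa for two binary classifications of the same participants.

    a_only: active by monitor A only; b_only: active by monitor B only.
    """
    n = both_active + a_only + b_only + both_inactive
    p_observed = (both_active + both_inactive) / n
    p_a_active = (both_active + a_only) / n
    p_b_active = (both_active + b_only) / n
    # Agreement expected by chance from the marginal proportions.
    p_expected = p_a_active * p_b_active + (1 - p_a_active) * (1 - p_b_active)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical counts for 50 participants.
kappa = cohens_kappa(both_active=15, a_only=5, b_only=5, both_inactive=25)
print(round(kappa, 2))  # 0.58
```

In this constructed example the two monitors agree on 80% of participants, yet Kappa is only about 0.58 because roughly half of that agreement would be expected by chance.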
Accelerometry should not be regarded as a gold standard for measuring free-living PA, nor necessarily as a more accurate method of measuring daily steps than the pedometer. Although the previous version of the Actigraph accelerometer was the only commercially available accelerometer that correlated reasonably with doubly labelled water, most validation studies have been conducted in controlled environments. Validity is lower when applied to free-living settings. Two armband accelerometers have also recently been shown to be highly correlated with doubly labelled water in free-living conditions. Both the accelerometer and the pedometer measure biomechanical body movement. Hence, validity of the monitors against energy expenditure should not be a major concern when assessing agreement between these devices.
This study had some limitations. Firstly, although the participants were asked to record the times when they wore the monitors, we cannot be sure that both monitors were worn for exactly the same time. Some women had very low pedometer step counts but moderate accelerometer step counts, suggesting that the wearing time may have been different for each monitor or that the pedometer may have been at a tilt angle. Therefore, studies in controlled conditions, such as that by Connolly et al., would be necessary to be certain that monitors were worn for exactly the same time. Secondly, almost all of our participants were in the first or second trimester of pregnancy and therefore we do not know whether the results can be generalized to the third trimester, when activity decreases and the abdominal circumference is much larger. On the other hand, these results may be generalized to non-pregnant overweight and obese women of similar age.
Thirdly, participation was low (33%) and 34% of the participants were excluded because of fewer than three valid days of accelerometer data. The activity levels of the participants were similar or slightly lower than those reported in pregnant women in other comparable studies [5, 6, 16, 43]. The purpose of this study, however, was to compare methods of measurement, rather than obtain representative estimates of PA levels in pregnancy.