Study design and ethics
We performed a randomized, controlled trial with two parallel groups. Subjects were allocated 1:1 to each group. Both the intervention and control groups received the results of a baseline fitness test and an information leaflet on physical activity. For the intervention group, a private company provided distance counseling on physical activity and accelerometers to monitor daily physical activity. The monitors were intended to be used at work and during leisure periods. Counseling was provided by two exercise specialists. The control group did not receive the distance counseling and monitoring intervention. The groups were measured at baseline, 6 months (intervention group only), and 12 months, with the primary time point of interest being 12 months.
The objective of the intervention was to increase physical activity and consequently improve work productivity, while decreasing sickness absence. The primary outcomes measured were physical activity, work productivity, and sickness absence. The effectiveness of the intervention was assessed by comparing outcomes in the intervention and control groups at 6 months (physical activity and work productivity) and at 12 months (physical activity, productivity, and sickness absence). The Helsinki University Hospital Research Ethics Board (Coordinating Ethics Committee) approved the study, and it was performed according to the Declaration of Helsinki (2008).
Participants were recruited from a Finnish insurance company located in Helsinki. Recruiting occurred between September 2009 and November 2009. Subjects were eligible if they 1) were aged 18 years or older, 2) were in paid employment for at least 8 h a week, 3) were not scheduled to retire in the next two years and had not applied for a disability pension, 4) had completed a health risk appraisal and physical testing as a part of normal occupational healthcare, and 5) did not have any of the following medical conditions: pregnancy, diagnosis or treatment of cancer, or any other condition that would cause a risk to the participant’s health during testing. A complete list of the eligibility criteria is found in the study protocol.
Recruiting started with invitations to fill out a health risk appraisal, which was sent to all 1,116 employees of the company simultaneously. Respondents (n = 817; 73%) were invited to complete a fitness test, which was done at the workplace during normal office hours. A total of 596 subjects volunteered for fitness testing. A questionnaire concerning medical history and medication was completed before the strenuous parts of the testing. Forty-six employees were excluded for medical reasons, leaving 550 who were eligible to participate. Six subjects were further excluded from the trial: four were scheduled to retire, one was about to leave the company, and one declined to participate. Therefore, the randomized study population consisted of 544 employees.
The study design, implications of the trial, and alternative options were explained to the subjects in a cover letter. The letter emphasized that participating in the trial was voluntary and that employees would get the best treatment available and the full attention of occupational health care when needed, even if they did not want to participate. It also explained that participants were free to withdraw from the trial at any point without any kind of penalty. Employees could ask the research staff questions about the study without any obligation to participate in the trial. Each subject who wished to participate individually signed an informed consent form. This form also allowed personal data to be collected from other data registers (health risk appraisal; physical testing; sickness absence records) so that we could add it to the research database and use it for the study. All 544 subjects signed the informed consent form, but two declined the use of their sickness absence data.
Block randomization with blocks of ten was used. A biostatistician prepared the randomization scheme in advance by using a computer-generated randomization table. Based on the randomization scheme, two research assistants prepared sealed envelopes containing a referral to either the intervention group or the control group.
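As a minimal sketch of block randomization with blocks of ten and 1:1 allocation, the following Python code shuffles five intervention and five control labels within each block. The function name, group labels, and seed are illustrative assumptions; the trial's actual randomization table is not reproduced here.

```python
import random

def block_randomize(n_subjects, block_size=10, seed=2009):
    """Return a 1:1 allocation list built from shuffled blocks (illustrative)."""
    if block_size % 2 != 0:
        raise ValueError("block_size must be even for 1:1 allocation")
    rng = random.Random(seed)  # hypothetical seed, used only for reproducibility
    allocation = []
    while len(allocation) < n_subjects:
        # Each block holds equal numbers of both labels, in random order
        block = (["intervention"] * (block_size // 2)
                 + ["control"] * (block_size // 2))
        rng.shuffle(block)
        allocation.extend(block)
    return allocation[:n_subjects]

schedule = block_randomize(544)
```

Because every complete block is balanced, the two group sizes can never differ by more than half a block (five subjects here) at any point during enrolment.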
Each subject who signed the informed consent form was given a sealed envelope by the research staff according to the randomization scheme. In this way, the researchers were not able to identify group assignments. The subject opened the envelope only after receiving the fitness test results and was not allowed to change groups after randomization.
After randomization, neither the participants nor research staff were blinded to group assignments, due to the nature of the intervention. However, data entry was blinded, as sickness absence data were extracted from the company records automatically in electronic format and computer entry of self-reported data was done by a research assistant who was blinded to group assignments. Data analysts were not blinded.
At the beginning of the study, both groups received written results of their physical exams, and all subjects were given general information on physical activity and health. The results and informational material were briefly explained. Occupational health care continued in both groups as usual.
The intervention consisted of activity monitoring and distance counseling during the twelve-month study period. The subjects assigned to the intervention group were given a uni-axial accelerometer (PAM, model AM 200, PAM BV, the Netherlands) for monitoring daily physical activity. The PAM accelerometer has been found reliable in laboratory settings for estimating energy expenditure in treadmill walking and stair walking. It produces a single index score that accumulates during the day and is continuously shown on its display. The score is a proxy measure of total daily physical activity.
At the beginning of the study, each subject set a daily PAM score goal in consultation with a counselor. The subjects installed special software on their computers, which allowed them to upload their PAM scores to the service provider’s database over the Internet. Each time a subject signed on to the provider’s website, his or her PAM score goal was displayed. This goal could be modified by the user and the coach throughout the intervention. On each subsequent login, the website presented all of a subject’s uploaded PAM scores and goals graphically by week or month. Subjects who did not log on to the site at least every two weeks to upload activity data were to receive a phone call or a message from the coach.
Both groups received a questionnaire that was used to measure physical activity and work productivity at the beginning of the study and 6 and 12 months later. A fitness test was done at baseline for both groups, at 6 months for the intervention group, and at 12 months for both groups. Sickness-related absence data was obtained from employer records. Data collection for the study started in September 2009 and continued until November 2010.
Primary outcome measurements
The primary outcomes were 1) physical activity, 2) work productivity, and 3) sickness absence.
The volume (frequency, intensity, duration) of physical activity was assessed by a self-administered questionnaire that we created. The questionnaire enabled comparison of physical activity between the study arms. The questionnaire was based on the International Physical Activity Questionnaire (IPAQ).
The volume of physical activity is often expressed as metabolic equivalents (METs). The weekly volume of physical activity (MET min-per-week) is the product of the time, frequency, and intensity of physical activity. MET min-per-week was calculated as follows, using the IPAQ scoring protocol: (daily minutes of walking x days per week with walking x 3.3) + (daily minutes of moderate-intensity activity x days per week with moderate-intensity activity x 4.0) + (daily minutes of vigorous activity x days per week with vigorous activity x 8.0). All daily minutes exceeding 120 were truncated to 120 min, as proposed in the “Guidelines of Data Processing and Analysis of IPAQ Short Version”, in an attempt to normalize skewed population data. In processing and cleaning the IPAQ questionnaire answers, we followed the IPAQ guidelines, except for the recommendation to exclude missing data. Instead, we used multiple imputation of missing data.
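The scoring rule above can be sketched in Python as follows. The MET weights (3.3, 4.0, 8.0) and the 120-minute truncation limit come from the text; the function name, argument names, and the example respondent are hypothetical.

```python
MET_WALKING = 3.3
MET_MODERATE = 4.0
MET_VIGOROUS = 8.0
MAX_DAILY_MINUTES = 120  # truncation limit stated in the text

def met_minutes_per_week(walk_min, walk_days, mod_min, mod_days,
                         vig_min, vig_days):
    """Weekly physical activity volume in MET min-per-week (illustrative)."""
    def capped(minutes):
        # Daily minutes above the limit are truncated to the limit
        return min(minutes, MAX_DAILY_MINUTES)

    return (capped(walk_min) * walk_days * MET_WALKING
            + capped(mod_min) * mod_days * MET_MODERATE
            + capped(vig_min) * vig_days * MET_VIGOROUS)

# Made-up respondent: 30 min walking on 5 days, 150 min moderate activity
# on 2 days (truncated to 120), and 20 min vigorous activity on 3 days.
# 495 + 960 + 480 = 1935 MET min-per-week, up to floating-point rounding.
example = met_minutes_per_week(30, 5, 150, 2, 20, 3)
```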
Work productivity was measured with the QQ instrument. Respondents assessed how much work they performed effectively during regular hours on their last regular workday as compared with usual. The quantity and quality of work productivity were measured on 10-point numerical rating scales. Two scores from 0 to 10 were given: one for quantity of work and one for quality. A score of 0 represented “nothing” and 10 on each scale represented “normal quantity/quality” [24, 25]. The quantity and quality scores were multiplied together to obtain the QQ score, which was on a scale from 0 to 100. The QQ score correlates with objective work output.
Sickness absence was operationalized as the accumulated number of sickness absence days during the study period, excluding weekends. The number of sickness absence days during the 12-month period prior to randomization was used as the baseline. This data was obtained, without medical diagnoses, from employer payroll records. Employees are required to inform the company when sick for one to three days and must provide a sickness certificate when absent for longer than three days. Data privacy was strictly followed. Records were checked for inconsistencies. Maternity/paternity leave and absence from work to care for a sick child were not counted as sickness absences.
Secondary outcome measurements
The secondary outcomes measured were changes in body weight, waist circumference, body fat percentage, blood pressure, and aerobic fitness. These variables were measured during the fitness test at baseline, at 12 months for both groups, and at 6 months for the intervention group. Details on these measures have been described in the study protocol.
Sample size calculations
The sample size calculation was based on the following predefined assumptions. The standard deviation for the IPAQ score in our population was estimated to be 1500 MET min-per-week. We considered a difference of 400 MET min-per-week between the intervention and control groups to be practically significant. This difference, which corresponds to a standardized effect size of 0.27, is detectable with 85% power in two-tailed tests with an alpha of 0.05 for a sample of 253 employees in each group. Therefore, the obtained study population of 544 subjects was adequate for detecting a practically significant difference with a 7% drop-out rate.
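The arithmetic behind these figures can be checked with a short Python sketch. It assumes the standard normal-approximation formula for comparing two means, n = 2((z_{1-alpha/2} + z_{power}) / d)^2 per group; the text does not state which formula was actually used, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def n_per_group(sd, difference, power=0.85, alpha=0.05):
    """Per-group sample size for a two-sided, two-sample comparison of means
    (normal approximation; an assumed formula, not taken from the source)."""
    d = difference / sd                             # standardized effect size
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

n = n_per_group(sd=1500, difference=400)            # 253 per group
dropout_allowance = 1 - (2 * n) / 544               # about 0.07
```

With 253 subjects per group, 506 of the 544 randomized subjects need to complete the trial, which leaves room for a drop-out rate of about 7%.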
The intervention effect was estimated based on the intention-to-treat principle. Subjects who left for maternity leave, resigned, or retired by the end of the study period were excluded from analysis. The two subjects who declined the use of their sickness absence data were excluded from analysis of sickness absence.
A high number of fitness tests and questionnaires were missing at the 12-month time point. We assumed that data were missing at random and performed multiple imputation with a Gaussian expectation-maximization algorithm using the MATLAB toolbox pmtk3. The number of random imputations used was 20. Imputation covariates included items from the previous tests and questionnaires, such as age, gender, body-mass index, and maximal oxygen uptake. Negative values from imputation were truncated to zero. Imputed values of work productivity were truncated to the allowed range (0–100). We performed a sensitivity analysis of the imputation procedure by doing a complete case analysis, that is, using data only from subjects who had completed the trial and had no missing data.
For physical activity, work productivity, and each secondary outcome, the difference between the intervention and control groups was estimated using ANCOVA, adjusting for baseline. The analyses were done using the statistical software R. As the number of observations was relatively large compared to the number of covariates, the results could be interpreted as approximately Bayesian with non-informative priors for the parameters. For sickness absence (SA), we used the hurdle negative binomial model to account for its discrete and non-Gaussian distribution. Due to the complex distribution, full Bayesian inference with a hierarchical prior was used.
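To illustrate what a baseline-adjusted group difference means, here is a minimal pure-Python sketch of ANCOVA with one covariate. It is not the authors' R code: the function name and the data are made up, and a common slope on baseline in both groups (no interaction term) is assumed.

```python
from statistics import mean

def ancova_adjusted_difference(baseline, follow_up, group):
    """Baseline-adjusted mean difference (group 1 minus group 0).

    Equals the group coefficient in the least-squares fit
    follow_up ~ intercept + group + baseline (common slope assumed).
    """
    within_xy = 0.0
    within_xx = 0.0
    means = {}
    for g in (0, 1):
        x = [b for b, k in zip(baseline, group) if k == g]
        y = [f for f, k in zip(follow_up, group) if k == g]
        mx, my = mean(x), mean(y)
        means[g] = (mx, my)
        within_xy += sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        within_xx += sum((xi - mx) ** 2 for xi in x)
    slope = within_xy / within_xx            # pooled within-group slope
    raw_diff = means[1][1] - means[0][1]     # unadjusted difference
    baseline_diff = means[1][0] - means[0][0]
    return raw_diff - slope * baseline_diff

# Made-up, noise-free data: follow_up = 100 + 0.5 * baseline + 300 * group
baseline = [1000, 2000, 3000, 1500, 2500, 4000]
group = [0, 0, 0, 1, 1, 1]
follow_up = [100 + 0.5 * b + 300 * g for b, g in zip(baseline, group)]
effect = ancova_adjusted_difference(baseline, follow_up, group)
```

In this toy data set the raw difference in follow-up means is inflated by the higher baseline values in group 1; the adjustment recovers the built-in effect of 300.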
Hurdle models assume a two-stage process. In our analysis, the first process (the zero process) determined whether a person had any SA days. The second process (the count process) determined the number of non-zero SA days. We used logistic regression and zero-truncated negative binomial regression to model the zero and count processes, respectively. In contrast to Poisson regression, negative binomial regression allows overdispersion, which is common in count data. The SA days of the previous year were adjusted for by including them as a covariate in the model. We also included a random effect component to model person-specific levels for SA days. The hurdle negative binomial model was implemented using MATLAB’s GPstuff toolbox.
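The two-stage structure can be made concrete with a small Python sketch of the hurdle negative binomial probability mass function. It uses fixed, hypothetical parameter values instead of regression models, and it omits the covariates, random effect, and Bayesian inference described above; it is not the GPstuff implementation.

```python
import math

def nb_log_pmf(k, r, p):
    """Log pmf of a negative binomial with size r and success probability p."""
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log(1 - p))

def hurdle_nb_pmf(k, prob_any, r, p):
    """P(Y = k) under a hurdle negative binomial model (illustrative)."""
    if k == 0:
        # Zero process: probability of having no SA days at all
        return 1.0 - prob_any
    # Count process: negative binomial truncated to k >= 1
    p_zero_nb = math.exp(nb_log_pmf(0, r, p))
    truncated = math.exp(nb_log_pmf(k, r, p)) / (1.0 - p_zero_nb)
    return prob_any * truncated

# Hypothetical parameters, chosen only for illustration
params = dict(prob_any=0.6, r=1.5, p=0.2)
p_no_absence = hurdle_nb_pmf(0, **params)
total = sum(hurdle_nb_pmf(k, **params) for k in range(2000))
```

The probability of zero days is set entirely by the zero process, while the count process only distributes the remaining probability over positive counts, so the probabilities still sum to one.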
We also performed an exploratory subgroup analysis to detect possible effect modifiers and mediators. The effect modifiers were personal characteristics (age and gender), self-rated level of physical activity, job characteristics (specialist/manager), and sick leave days in the past year, each assessed at baseline. We used physical activity at 12 months as the outcome of this analysis.
Finally, we assessed whether adherence to the intervention was a mediator for the effect on sickness absences. The study population was divided into adhering and non-adhering groups. Those in the adhering group returned the questionnaire and had a physical exam at 12 months. We used the number of sickness absence days during the follow-up year as the outcome, as this information was also available for the non-adhering group. We then assessed the interaction adherence x group assignment using a hurdle negative binomial model.
For differences between the groups, we report the baseline-adjusted mean difference and its 95% Bayesian credible interval (CI). A 95% CI is an interval that contains the true difference with 95% probability.
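As a hedged illustration, a credible interval can be read off a set of posterior draws of the group difference. The sketch below computes an equal-tailed interval from sorted draws; the choice of an equal-tailed interval, the function name, and the stand-in draws are assumptions, since the text does not describe how the intervals were computed.

```python
def credible_interval(samples, level=0.95):
    """Equal-tailed credible interval from posterior draws (illustrative)."""
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    # Pick the draws closest to the lower and upper tail quantiles
    lower = ordered[round(tail * (n - 1))]
    upper = ordered[round((1.0 - tail) * (n - 1))]
    return lower, upper

# Stand-in for posterior draws: the integers 0..1000 in reverse order
draws = list(range(1000, -1, -1))
low, high = credible_interval(draws)
```

Here 95% of the stand-in draws lie between the returned bounds, with 2.5% of the draws cut off in each tail.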