Development and validation of a pre-scoring system for nonspecific low back pain among general population in Guangzhou: a cross-sectional study

Background Nonspecific Low Back Pain (NLBP) is a common disease with a low cure rate and a significant impact on the population. This study aimed to develop and validate a pre-scoring system for identifying the risk of suffering from NLBP among the general population in Guangzhou. Methods A total of 1439 eligible subjects were surveyed in Guangzhou by stratified random sampling and were then randomly divided into a development dataset (69.6%) and a validation dataset (30.4%). Based on the development dataset, factors potentially associated with NLBP (average exercise times weekly, the intensity of daily work, etc.) were tested by sequential logistic regression, and a pre-scoring system was formulated with Sullivan's method and then graded. The internal validity of the system was assessed by AUC and calibration plot, and the external validation was performed in the validation dataset. Results The prevalence rates of NLBP in the development dataset and the validation dataset were 12.97% and 13.27%, respectively. Age, BMI, average exercise times weekly, gender, educational level, the intensity of daily work, place of residence, monthly income, overall evaluation of health condition and physiological health were identified as significant factors. The total risk score ranged from 0 to 38 and was split into three risk grades: low risk (0 to 18), intermediate risk (19 to 22) and high risk (23 to 38). The pre-scoring system had adequate calibration and good discriminating ability, with a bootstrap-corrected AUC of 0.861 in the development dataset and an AUC of 0.821 in the validation dataset. Conclusions A pre-scoring system that could help clinicians to assess the risk of NLBP in the general population was validated. Further validation of the system in a new population or a prospective cohort study is suggested.
Electronic supplementary material The online version of this article (10.1186/s12889-019-7564-9) contains supplementary material, which is available to authorized users.


Introduction
Non-specific low back pain (NLBP) is defined as tension, soreness and/or stiffness that exists in the lower back region for which there is not a specific cause of the pain [1][2][3][4][5][6]. As a severe public health problem in the world for many years, NLBP has an approximate demission rate of 39% from work and meanwhile it has been one of the most common reasons for using complementary and alternative medicine [7][8][9][10]. The prevalence of NLBP ranges from 10 to 49% among the population of different ages, and even as much as 60-85% during an individual's lifetime [11][12][13].
The literature indicates that NLBP is attributable to multiple risk factors, including gender [12], smoking [14], BMI [7] and improper sitting posture [15]. Several studies show that females have a higher prevalence of LBP than males across all age groups, and that postmenopausal women are more susceptible to it than young or middle-aged women because of female hormone fluctuation and menstruation [12]. With regard to lifestyle, sitting or standing for more than 2 h has been found to increase the likelihood of having NLBP [15]. Clinically, problems such as scoliosis, poor low back muscle endurance, abnormal trunk mobility, and muscle imbalance are stronger risk factors [16][17][18][19][20].
Von Korff and Miglioretti [21] developed a risk score system to identify patients at risk of chronic LBP, which included pain severity, interference with usual activities, number of other pains and number of days with back pain in the prior 6 months. Hill et al. [22] developed a brief screening tool to identify subgroups of patients for initial treatment in primary care. Janwantanakul et al. [23] built another risk scoring system, based on a prospective cohort study, to identify office workers susceptible to LBP. However, their findings, along with those of other studies, are somewhat inconsistent and have not been validated with external data [23][24][25][26][27][28]. To our knowledge, no scoring system to identify members of the general population at risk of NLBP has been comprehensively established. The purpose of the present study was to construct a pre-scoring system that helps health care providers estimate an individual's probability of suffering from NLBP from only a few readily available clinical data.

Source of data
This field investigation was conducted from August 2013 to May 2014, and a total of 2100 participants were surveyed using a stratified sampling approach. Briefly, in stage 1, because the Nansha district is newly established and geographically far from the city centre of Guangzhou, we selected the remaining 11 districts in Guangzhou. In stage 2, one community was randomly selected from each district. Finally, we randomly selected individuals aged 20-59 from the selected communities, with only one participant selected from each household. The work was approved by the Institutional Review Board at Zhujiang Hospital, Southern Medical University, Guangzhou, China (NO. 2013-BLK-009).

Outcome definition
The body diagram from the standardised Nordic questionnaire was used to identify the location of low back pain [29]. In the questionnaire, the following questions were used for the primary diagnosis of NLBP: "Have you ever been diagnosed with such lumbar diseases as a lumbar disc herniation, lumbar hyperosteogeny, lumbar muscle strain, lumbar degenerative disease or rheumatism?" "Is this the first time for you to suffer from low back pain?" "How long does your low back pain last?" In this study, NLBP was defined as low back pain that lasted between 6 weeks and 12 months and was confirmed not to arise from lumbar diseases [1].

Participants and predictors
A questionnaire was designed for the survey, in which participants' private information was omitted. All participants gave informed consent and took part in the survey voluntarily.
The inclusion criteria were as follows: (1) age between 20 and 59 years, (2) no deformity or asymmetry in the spine or lower limbs, (3) no mutilation, (4) no problems in reading and communication [23].
The researchers were trained in advance to assist the participants in completing the questionnaire, which included demographic, work-related and psychosocial data as well as the presence of NLBP.
The demographic data included birthday, height, weight, gender, nation, educational level, smoking habits, drinking habits, marital status, average exercise times weekly, place of residence and monthly income.
The work-related factors included the type of occupation, main nature of work, the intensity of daily work, job position and exposure or not to any vibration sources at work.
To assess the quality of life, the participants were also asked to complete the Chinese abbreviated version of the World Health Organization Quality of Life questionnaire (WHO-QOL-BREF-Chinese), which consisted of 26 items in four domains (physiological health, psychological health, social relation health and environmental health) and two general evaluations about the quality of life and health condition.
A pre-survey was conducted three times to correct items and evaluate the reliability and validity of the questionnaire before formal data collection. The content validity of the questionnaire was assessed by six experienced reviewers, including one biostatistician, one epidemiologist, two surgeons, one physician, and one community manager. Cronbach's α was 0.818, which indicated acceptable reliability.

Sample size and missing data
It was difficult to calculate the sample size for the observational study, especially in multivariable regression model settings. We used the rule of thumb recommended by Peduzzi et al. [31] and Harrell et al. [32], namely, events per variable (EPV) being 10 or higher in our study. If there were about ten significant associated factors with NLBP, a minimum of 100 (10 × 10) participants should have the event in the sample.
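The rule-of-thumb arithmetic above can be sketched in a few lines of Python. The function names are illustrative assumptions, not part of the study's SAS workflow, and the second call simply reuses the 130 NLBP cases reported later for the development dataset together with the ten factors retained in the final model.

```python
import math


def min_events_required(n_predictors: int, epv_threshold: float = 10.0) -> int:
    """Smallest number of outcome events that satisfies the EPV rule of thumb."""
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    return math.ceil(n_predictors * epv_threshold)


def events_per_variable(n_events: int, n_predictors: int) -> float:
    """Observed events-per-variable ratio for a fitted model."""
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    return n_events / n_predictors


# About ten associated factors and EPV >= 10, as stated in the text.
required = min_events_required(10)

# Assumption for illustration: 130 events and ten retained factors.
observed = events_per_variable(130, 10)
```

Under these assumptions the observed ratio is above the threshold of 10, which is consistent with the rule of thumb being met in the development dataset.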
Since no more than 2% of the values were missing, we imputed them with the EM algorithm to ensure the stability of the results. All results were based on the imputed complete dataset.

Statistical analysis
Continuous variables were expressed as mean ± S.D., while categorical or ordinal variables were expressed as absolute (n) and relative (%) frequencies. All subjects were randomly divided into two sets, a development dataset (69.6%) and a validation dataset (30.4%). Three steps were taken to develop the pre-scoring system based on the development dataset. First, we conducted univariate logistic regression to select possible associated factors with a P-value ≤ 0.1. Then, backward logistic regression was used to select potential associated factors (demographic and work-related factors) and to construct a basic model. Finally, the psychosocial factors (the four domains and the two general evaluations of quality of life) were evaluated sequentially on top of the basic model, and a risk model was subsequently developed. The incremental prognostic usefulness of the psychosocial factors was evaluated by the integrated discrimination improvement (IDI) and the continuous net reclassification improvement (NRI) [33].
Based on the developed risk model, we then created a simple pre-scoring system using the method of Sullivan et al. [34]. First, we classified the continuous variables into categories in terms of clinical significance. Second, we specified the mid-point value as the reference value for each category. To determine the reference values for the first and last categories of continuous variables, we used the 1st and the 99th percentiles to minimize the influence of extreme values. Third, the lowest-risk category of each variable served as the base category. The difference in reference values between each category and the base category, multiplied by the regression coefficient of the corresponding variable in the risk model, was defined as the distance of that category from the base category. Fourth, one point of the scoring system was defined as a constant of 0.48, which corresponds to the increase in risk (on the log-odds scale) associated with a 5-year increase in age (0.096 × 5). Finally, the base category of each variable was assigned a score of 0, and the score of each other category was computed by dividing its distance by the constant of 0.48 and rounding to the nearest integer.
The scores were then summed to create a total risk score for each participant, and the participants were classified into three grades: low risk, intermediate risk, and high risk of NLBP.
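The point assignment described above can be illustrated with a minimal Python sketch. The age coefficient (0.096 per year) and the constant 0.48 come from the text; the age categories and their reference values are hypothetical placeholders chosen only for illustration, and the function name is an assumption rather than anything used in the study.

```python
import math


def sullivan_points(beta, reference_values, points_constant=0.48):
    """Assign integer points to each category of one variable.

    beta             -- regression coefficient per unit of the variable
    reference_values -- dict mapping category label -> reference value
    points_constant  -- log-odds increase that is worth one point
    """
    # The base category is the one with the lowest risk, i.e. the smallest
    # contribution beta * reference value to the linear predictor.
    base = min(reference_values.values(), key=lambda w: beta * w)
    points = {}
    for label, w in reference_values.items():
        distance = beta * (w - base)  # distance from the base category
        # Round half up to the nearest integer.
        points[label] = math.floor(distance / points_constant + 0.5)
    return points


# Hypothetical reference values (years); the paper's actual values are
# given in its Table 3 and may differ.
age_reference = {"20-29": 25, "30-39": 35, "40-49": 45, "50-59": 55}
age_points = sullivan_points(0.096, age_reference)
```

With these placeholder reference values, each 10-year step in age is worth two points, because 0.096 × 10 / 0.48 = 2. A variable with a negative coefficient is handled the same way: the category with the largest reference value becomes the base and receives 0 points.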
The discrimination of the models and of the system was measured by the area under the ROC curve (AUC). Calibration of the predictions was assessed with calibration plots. The internal validity of the system was assessed by bootstrap techniques, and the external validation was performed in the validation dataset.
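As a reminder of what the AUC measures, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen participant with NLBP has a higher score than a randomly chosen participant without it, with ties counted as one half. This is a plain-Python illustration on made-up toy data, not the SAS procedure used in the study, and it does not include the bootstrap correction.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores -- risk scores or predicted probabilities
    labels -- 1 for participants with the outcome, 0 for those without
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    # A concordant pair counts 1, a tied pair counts 0.5.
    concordant = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return concordant / (len(pos) * len(neg))


# Toy data for illustration only.
toy_scores = [10, 19, 23, 30]
toy_labels = [0, 1, 0, 1]
toy_auc = auc(toy_scores, toy_labels)
```

In the toy data, three of the four case/non-case pairs are ranked correctly, so the AUC is 0.75. The pairwise form is quadratic in sample size, which is acceptable for a dataset of this size but would normally be replaced by a rank-based routine from a statistics library.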
If the correlation coefficient between variables was ≥0.60, only the variable judged to be more clinically relevant was included in the model. Confirmatory factor analysis was conducted to recheck the structure validity of the WHO-QOL-BREF-Chinese. All statistical calculations were performed on SAS software (v. 9.3; SAS Institute Inc., Cary, NC). A 2-tailed P value < 0.05 was considered as statistically significant.
The reporting of the present study closely follows the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement [35].

Characteristics of the participants
The flow chart of participants is presented in Fig. 1. In total, 2100 questionnaires were distributed and 1953 were successfully collected, a response rate of 93.0%. Among them, 514 were excluded, including 31 aged under 20 years, 78 aged over 59 years, 142 with an abnormal or asymmetric spine or lower limbs, 213 who reported pregnancy or spinal, intra-abdominal or femoral surgery in the past year and 69 with a musculoskeletal disorder. The final sample covered 1439 participants, of whom 1002 (69.6%) and 437 (30.4%) were assigned to the development and validation datasets, respectively. In total, 188 (13.1%) were confirmed to have NLBP, with 130 (13.0%) in the development dataset and 58 (13.3%) in the validation dataset. The mean age of the participants was 34.41 ± 9.34 years, their mean body mass index (BMI) was 21.92 ± 3.03 kg/m², and 717 (50.0%) of them were female. More detailed baseline characteristics of the eligible participants in both the development and validation datasets are shown in Table 1.

NLBP risk model
The demographic and work-related factors significantly associated with NLBP in the basic model were as follows: age, BMI, average exercise times weekly, gender, educational level, the intensity of daily work, place of residence and monthly income (Table 2). The AUC of the basic model was 0.838 (95%CI: 0.798-0.878). Two psychosocial factors (the overall evaluation of health condition and physiological health) were added to form the final risk model. The risk model had excellent discriminating power, with an AUC of 0.868 (95%CI: 0.830-0.905), and discriminated significantly better than the basic model (0.868 vs 0.838, P < 0.001). Neither collinearity nor interaction effects were significant.

NLBP pre-scoring system and risk category
The associated factors and corresponding scores for calculating the risk score of NLBP are presented in Table 3. The estimated probability, according to the proposed risk score, was expressed as:
$$P(\mathrm{NLBP}) = \frac{1}{1 + \exp\left[-\left(-11.194 + 0.48 \times \text{total score}\right)\right]} \qquad (1)$$
where −11.194 and 0.48 were the intercept and slope coefficient from the model, respectively. The risk of NLBP was calculated based on the total score, ranging from 0 to 38, with corresponding predicted probabilities from 0.0 to 99.9% (Table 4). The bootstrap-corrected AUC of the pre-scoring system was 0.861 (95%CI: 0.822-0.898).
In the validation dataset, the AUC of the system was 0.821 (95%CI: 0.758-0.883). The receiver operating characteristic (ROC) curves of the basic model, the risk model, and the system are shown in Fig. 2. The calibration plots of both datasets showed no apparent over- or under-estimation (Fig. 3).
To illustrate the application of the risk score, consider a 30-year-old man with a height of 1.75 m, a weight of 70.0 kg, an average of 2 exercise times weekly, an educational level of college degree or above, an intermediate intensity of daily work, an urban place of residence, a monthly income of > 10,000 RMB, an overall evaluation of health condition of satisfied (SF2, score of 4), and a physiological health score of 14 (possible range 5 to 20). From Table 3, his total risk score is: 0 (Gender) + 2 (Age) + 1 (BMI) + 2 (Average exercise times weekly) + 4 (Educational level) + 3 (Intensity of daily work) + 0 (Place of residence) + 6 (Monthly income) + 0 (SF2) + 2 (SF_PHYS) = 20, and the estimated probability that he has NLBP is 16.88% according to Formula 1. We also provide a convenient Excel tool with which individuals can obtain their underlying risk of NLBP by entering their personal information (Additional file 1: Excel S1).
To enhance the practical utility of the system, we created three NLBP risk categories: low risk (scores of 0 to 18), intermediate risk (scores of 19 to 22), and high risk (scores of 23 to 38). The categories were created by identifying the groups of scores that resulted in "significant" (P-value < 0.001) differences in the prevalence rate of NLBP between pairwise categories. The prevalence rates of NLBP in the three categories in the development dataset were 4.3, 14.8, and 67.3%, respectively. The corresponding results in the validation dataset were 5.0, 15.5, and 66.7%, respectively (Table 5).
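The worked example and the three risk grades can be reproduced with a short Python sketch. It assumes the standard logistic form for the predicted probability, using the reported intercept (−11.194) and slope (0.48); under that assumption it returns the 16.88% quoted in the example. The function and variable names are illustrative and are not taken from the authors' Excel tool.

```python
import math

INTERCEPT = -11.194  # intercept reported for the pre-scoring system
SLOPE = 0.48         # log-odds increase per point of the total score


def nlbp_probability(total_score):
    """Predicted probability of NLBP (assumed logistic form)."""
    if not 0 <= total_score <= 38:
        raise ValueError("total score must lie between 0 and 38")
    linear_predictor = INTERCEPT + SLOPE * total_score
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def risk_grade(total_score):
    """Map a total score to the three risk categories in the text."""
    if not 0 <= total_score <= 38:
        raise ValueError("total score must lie between 0 and 38")
    if total_score <= 18:
        return "low"
    if total_score <= 22:
        return "intermediate"
    return "high"


# Points of the 30-year-old man in the worked example.
example_points = {
    "gender": 0, "age": 2, "bmi": 1, "exercise": 2, "education": 4,
    "work_intensity": 3, "residence": 0, "income": 6, "sf2": 0, "sf_phys": 2,
}
example_total = sum(example_points.values())
example_probability = nlbp_probability(example_total)
example_grade = risk_grade(example_total)
```

For this participant the total score is 20, which falls in the intermediate-risk grade. The same two functions also reproduce the stated range of predicted probabilities, from about 0.0% at a score of 0 to about 99.9% at a score of 38.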

Main finding
The prevalence rate of NLBP in our study was 13.1%, which is consistent with the range of 10.0-49.0% reported in the literature for this definition of NLBP [5][11][12][13][36]. The validated pre-scoring system we developed, based on ten factors, had high discriminative power, with a bootstrap-corrected AUC of 0.861, and was strongly supported by the external validation (AUC of 0.821). These factors were age, BMI, average exercise times weekly, gender, educational level, the intensity of daily work, place of residence, monthly income, overall evaluation of health condition and physiological health. To our knowledge, this is the first simple and validated pre-scoring system for identifying the risk of NLBP among the general population. An Excel evaluation tool is provided in Additional file 1.

Interpretation
Many cohort studies and meta-analyses have been conducted to explore the potential risk factors of NLBP, and the effects of some interventions have been confirmed by well-designed randomised controlled trials [37]. However, few of them focus on readily available psychosocial data. In this research, we studied not only demographic and work-related characteristics but also psychosocial features. The results show that the overall evaluation of health condition (OR = 0.48) and physiological health (OR = 0.77) are two non-negligible factors.
Table 3 Scores of the associated factors in the pre-scoring system
Contrary to the results of most of the literature, our findings show that the higher the income, the higher the risk of NLBP, and the same holds for educational level. In our view, this may reflect the Chinese context, in which better-educated people are more likely to earn a higher income and more prone to being sedentary in the office. In practice, people with higher incomes tend to drive to work or when travelling, which keeps the waist muscles in a state of sustained tension. Yue et al. [38] pointed out that prolonged sitting and static posture are two potential risk factors of NLBP in China. Cocker et al. [39] and Hadgraft et al. [40] found that more occupational sitting is associated with higher income and education level in Australia. Other studies have also reached conclusions similar to ours about the relation between NLBP and education [38][41][42][43][44][45][46]. On the other hand, people with lower incomes cannot afford a car and therefore tend to travel on foot or by bicycle. Although they are more likely to be exposed to vibration while working, its influence may be outweighed by the two factors mentioned above.
Meanwhile, in communities throughout the developed world, people with higher income and people with higher education both tend to exercise regularly.
Some researchers have used different statistical models to predict the development of NLBP [23][24][25][30], such as the classification tree model, artificial neural networks and Bayesian methods. Their results are moderate and have not been validated in an external population. The logistic regression model offers good stability and easy-to-understand results, especially in medical research, so we used it in our study. A few researchers, in highly cited papers, have used different weighting methods based on regression coefficients to develop risk scores [47,48]. In our research, we chose Sullivan's approach, a generally accepted score construction method, to develop the scoring system, which should make it more stable [34].

Clinical and decision-making implication
The present study derived and validated a pre-scoring system rather than a decision rule. Its purpose is to provide information that allows clinicians and patients to understand the risks patients face and then to take action to reduce the risk of NLBP. The factors incorporated in the pre-scoring system are readily accessible in general clinical work. The proposed system may help clinicians to quickly identify patients at low risk of NLBP. For patients at high risk of NLBP, a more detailed assessment of pain and a diagnostic test would then be needed to quantify the risk adequately. Therefore, stratifying patients by risk level of NLBP according to the pre-scoring system is clinically relevant, particularly in disease prevention or care settings where efficient assessment is essential.

Comparison with other risk scores
Janwantanakul et al. [23] constructed a screening tool based on the previous history of LBP and psychological demand (assessed by the Job Content Questionnaire) with a good AUC of 0.76 (95% CI 0.68-0.83). Jensen et al. [28] used baseline clinical and psychosocial risk factors to predict patients with low, intermediate and high risk of an unsuccessful return to work, both initially and at 1 year, and reported an excellent predictive effect. However, our pre-scoring system was developed in the general population, whose characteristics and risk exposures differ from those of the populations mentioned above. The AUC of the proposed system reached 0.821 in the validation dataset, demonstrating its excellent discrimination in the general population.

Strengths and limitations
Our study has the following strengths: (1) it included a relatively large number of participants; (2) EM imputation was applied to use as much information as possible and to ensure the robustness of the results; (3) the pre-scoring system showed high discriminating power as well as high stability.
Our study also has several limitations. First, a cross-sectional study is not well suited to establishing causal relationships between exposure factors and outcome; therefore, the system in this study should be taken as a preliminary result. Second, there might be bias if the findings are applied outside Guangzhou. Third, we overlooked some potential risk factors from rehabilitation medicine or physical therapy (such as muscle imbalance), which may introduce some bias into our results. However, some studies have indicated that exercise can reduce muscle imbalance [49,50], which suggests that exercise, an essential variable in our model, is likely to capture part of the effect of muscle imbalance on NLBP. Finally, the bootstrap procedure was based on the developed risk score and did not include the variable selection step, which might leave the estimates somewhat over-optimistic.

Conclusions
We validated a pre-scoring system based on eight demographic and work-related features and two psychosocial factors that may be useful for assessing the risk of NLBP among the general population in Guangzhou.

Additional file
Additional file 1: Excel S1