
Psychometric performance of EQ-5D-5L and SF-6DV2 in measuring health status of populations in Chinese university staff and students



To compare measurement properties of EQ-5D-5L and SF-6DV2 in university staff and students in China.


A total of 291 staff and 183 undergraduates or postgraduates completed the two instruments, which were assigned in a random order. The health utility scores (HUS) of EQ-5D-5L and SF-6DV2 were calculated using the respective value sets for Chinese populations. The agreement of the HUSs was examined using intraclass correlation coefficients (ICC) and Bland-Altman plots. Convergent validity of the HUSs and of similar dimensions was assessed using Spearman’s correlation coefficient. Known-group validity of the HUSs and EQ-VAS score was assessed by comparing the scores of participants with and without three conditions (i.e., disease, symptom or discomfort, and injury), as well as by the number of any of the three conditions; their sensitivity was also compared.


The ICCs between the two HUSs were 0.567 (staff) and 0.553 (students). Bland-Altman plots showed that EQ-5D-5L HUSs were generally higher. Strong correlations were detected for two pairs of similar dimensions (pain/discomfort of EQ-5D-5L and pain of SF-6DV2; anxiety/depression of EQ-5D-5L and mental health of SF-6DV2) in both samples. The correlation between the two HUSs was strong (0.692 for staff and 0.703 for students) and stronger than their correlations with the EQ-VAS score. All three scores could discriminate between groups in three known-group comparisons (disease, symptom or discomfort, and number of any of the three conditions). The two HUSs were more sensitive than the EQ-VAS score, and neither was superior to the other.


Both EQ-5D-5L and SF-6DV2 HUSs have acceptable measurement properties (convergent validity, known-groups validity, sensitivity) in Chinese university staff and students. Nevertheless, only two pairs of dimensions, EQ-5D-5L PD with SF-6DV2 PN and EQ-5D-5L AD with SF-6DV2 MH, showed the expected good convergent validity. The two types of HUSs cannot be used interchangeably, and each has its own advantages in sensitivity.



The generic preference-based measures EQ-5D and SF-6D are two well-known and widely used instruments for measuring health-related quality of life (HRQOL). Their responses can be converted into health utility scores (HUS) for use in clinical trials, economic evaluations and population health surveys [1,2,3,4]. For example, EQ-5D has been the HRQOL instrument in the China National Health Service Survey since 2008 [2]. Both EQ-5D and SF-6D are recommended for utility measurement in health economic evaluation by the China Pharmaceutical Economics Evaluation Guidelines (2020 edition) [3].

The EQ-5D was developed by EuroQol in 1996 [5]. It currently has two versions, EQ-5D-3L and EQ-5D-5L, both of which include five dimensions. The original version, EQ-5D-3L, categorizes each of the five dimensions into three severity levels (no problems, moderate problems, and extreme problems) and can define 243 (3^5) unique health states. Because the EQ-5D-3L was found to be insensitive to mild or even moderate differences in HRQOL and greatly limited by the ceiling effect [6], EQ-5D-5L was developed in 2009 with two more levels in each dimension (slight problems and severe problems) [7]. The EQ-5D health states can be converted into HUS, anchored at 1 (full health) and 0 (death), using country- or population-specific value sets. As expected, EQ-5D-5L has demonstrated better measurement properties than EQ-5D-3L [7,8,9]. The SF-6D, which is based on the SF-36, was developed by Brazier et al. in 2002 [10]. The SF-6D has two versions (SF-6DV1 and SF-6DV2) corresponding to the two versions of SF-36 [11]. SF-6DV1 has the disadvantages of unclear severity ordering of dimensions and limited sensitivity [12]. SF-6DV2 addresses them by simplifying level descriptions and providing clearer wording, and thus has better reliability and validity [11, 13,14,15]. Similarly, each SF-6D health state can be translated into a HUS based on a value set for SF-6D.

Although both EQ-5D and SF-6D measure the same concept of HRQOL and provide HUS, their measurement performance has differed across populations, such as general populations in Asia [16,17,18,19,20,21,22]. For example, a study in the Chinese general population suggested that SF-6DV2 is more sensitive in distinguishing participants with and without chronic diseases [22], while a study in the Thai general population found better sensitivity of EQ-5D-5L in distinguishing participants with different characteristics (gender, age, education level, household income, and number of diseases) [17]. These studies generally did not report the order in which the two instruments were administered, which is an important factor influencing the comparison results. In addition, no study has compared their performance in university staff and students.

Recently, an increasing number of studies have begun to measure the HRQOL of university staff and students. Both populations are under great pressure, from occupational overload and from employment-related academic performance, respectively [23,24,25,26]. This would adversely affect their physical and mental health and consequently their HRQOL [23, 25]. Indeed, a study has shown that university staff have lower HRQOL than the general population in China [24]. However, as no specific measurement tools have been developed for assessing the HRQOL of university staff or students, the EQ-5D or SF-6D is commonly employed for this purpose [27,28,29,30]. Hence, it is important to select the most appropriate measurement tool for the population being studied. This study thus aimed to compare the measurement properties of EQ-5D-5L and SF-6DV2 in university staff and students in China, with the order of the two instruments randomly assigned.


Study design and population

This is a web-based health survey targeting highly educated populations, i.e., university staff and students currently working or studying at a public university. The questionnaire was distributed through the largest online survey platform in China, Wen Juan Xing (Changsha Ranxing Information Technology Co., Ltd., Hunan, China). Wen Juan Xing, equivalent to Qualtrics, Survey Monkey or Cloud Research, provides online questionnaire design and survey functions for its customers. The study used snowball sampling, starting from a convenience sample of colleagues, friends and acquaintances. The questionnaire was then circulated among the study population via WeChat working groups, personal invitations and unofficial announcements by existing respondents. Participation was completely voluntary and no incentives were provided in any form. The study was approved by the IRB committee of the Air Force Medical Center in Beijing (KongTe: NO 2021-169-PJ01).

Data collection

The online questionnaire collected variables on health determinants, including demographic (age, gender, height, weight), lifestyle or behavioral (smoking, drinking) and socioeconomic (education, marital status) factors. Additionally, conditions (diseases, symptoms, discomforts) that can directly influence individual HRQOL were systematically collected. In the online questionnaire, either EQ-5D-5L or SF-6DV2 was randomly assigned first, followed by the other. This was done to avoid an order effect when measuring the same property with different instruments.


EQ-5D-5L measures an individual’s HRQOL on the day of the survey using two parts: a health-state descriptive system and a visual analog scale (EQ-VAS). The descriptive system includes five dimensions: mobility (MO), self-care (SC), usual activities (UA), pain/discomfort (PD), and anxiety/depression (AD). It defines 3125 health states in total, each expressed as a five-digit number combining the levels of the five dimensions [31]. For example, EQ-5D-5L “52341” means extreme problems in mobility, slight problems in self-care, moderate problems in usual activities, severe pain/discomfort, and not anxious/depressed. In this study, we used the Chinese EQ-5D-5L value set developed by Luo et al. to calculate the EQ-5D-5L HUS [32] (Table 1). The HUS for EQ-5D-5L state “52341” is 0.248. EQ-VAS is a 20 cm vertical visual scale, ranging from 0 (worst imaginable health) to 100 (best imaginable health), that reflects the respondent’s self-rated overall health status [33].
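As an illustration of how a five-digit state is read, the following minimal sketch decodes a state string into its dimension levels. The function and variable names are hypothetical and introduced only for this example; the dimension order and level wording follow the description above. It does not compute the HUS, which requires the value set in Table 1.

```python
# Dimension order and level wording as described in the text.
EQ5D_DIMENSIONS = [
    "mobility",
    "self-care",
    "usual activities",
    "pain/discomfort",
    "anxiety/depression",
]
EQ5D_LEVELS = {
    1: "no problems",
    2: "slight problems",
    3: "moderate problems",
    4: "severe problems",
    5: "extreme problems",
}


def decode_eq5d5l(state: str) -> dict:
    """Map a five-digit EQ-5D-5L state such as '52341' to level labels."""
    if len(state) != len(EQ5D_DIMENSIONS) or not state.isdigit():
        raise ValueError("state must be a five-digit string")
    levels = [int(ch) for ch in state]
    if any(level not in EQ5D_LEVELS for level in levels):
        raise ValueError("each digit must be between 1 and 5")
    return {dim: EQ5D_LEVELS[lv] for dim, lv in zip(EQ5D_DIMENSIONS, levels)}


if __name__ == "__main__":
    for dimension, label in decode_eq5d5l("52341").items():
        print(f"{dimension}: {label}")
```

For the state “52341” this reproduces the reading given in the text, from extreme problems in mobility to no problems in anxiety/depression.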

Table 1 Characteristics of EQ-5D-5L and SF-6DV2


SF-6DV2 assesses the HRQOL of individuals over the last 4 weeks in six dimensions, i.e., physical functioning (PF), role limitation (RL), social functioning (SF), pain (PN), mental health (MH), and vitality (VT), each of which has 5–6 functioning levels. SF-6DV2 can describe 18,750 health states, each of which is indicated by a six-digit number combining the levels in the six dimensions. For example, “312654” means your health limits you a little in moderate activities, you have no problems with your work or other regular daily activities as a result of your physical health or any emotional problems, your health limits your social activities a little of the time, you have pain extremely, you feel tense or downhearted and low all of the time, and you have a lot of energy a little of the time. The SF-6DV2 value set for China developed by Wu et al. was used in the study [15] (Table 1). According to it, the HUS for SF-6DV2 state “312654” is 0.204.
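The numbers of health states quoted for the instruments follow from multiplying the number of levels across dimensions. The sketch below checks this arithmetic. The per-dimension level counts for SF-6DV2 are an assumption chosen to be consistent with the text (five levels per dimension, with six for pain, as implied by the pain digit 6 in “312654”); the names are hypothetical.

```python
from math import prod

# Levels per dimension. EQ-5D counts are stated in the text; the SF-6DV2
# split (six levels for PN, five elsewhere) is an assumption consistent
# with the "5-6 functioning levels" and the example state "312654".
EQ5D3L_LEVELS = {"MO": 3, "SC": 3, "UA": 3, "PD": 3, "AD": 3}
EQ5D5L_LEVELS = {"MO": 5, "SC": 5, "UA": 5, "PD": 5, "AD": 5}
SF6DV2_LEVELS = {"PF": 5, "RL": 5, "SF": 5, "PN": 6, "MH": 5, "VT": 5}


def count_health_states(levels_per_dimension: dict) -> int:
    """Number of distinct states = product of the level counts."""
    return prod(levels_per_dimension.values())


if __name__ == "__main__":
    print(count_health_states(EQ5D3L_LEVELS))  # 243
    print(count_health_states(EQ5D5L_LEVELS))  # 3125
    print(count_health_states(SF6DV2_LEVELS))  # 18750
```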

Statistical analysis

Descriptive statistics were used to describe respondent characteristics, the response distributions on the EQ-5D-5L and SF-6DV2 dimensions, their HUSs, the EQ-VAS score, and the overall ceiling effects (the proportion of respondents reporting no problems in all dimensions). Continuous variables were expressed as mean and standard deviation (SD), and categorical variables as frequency and percentage.
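The overall ceiling effect defined above can be sketched as the share of respondents whose state consists only of level 1. The function name and the sample responses below are hypothetical and are not study data.

```python
def ceiling_effect(states: list) -> float:
    """Proportion of respondents reporting level 1 on every dimension."""
    if not states:
        raise ValueError("states must not be empty")
    at_ceiling = sum(1 for state in states if set(state) == {"1"})
    return at_ceiling / len(states)


if __name__ == "__main__":
    # Hypothetical EQ-5D-5L responses, for illustration only.
    responses = ["11111", "11121", "11111", "21111"]
    print(f"{ceiling_effect(responses):.1%}")
```

The same function applies to six-digit SF-6DV2 states, since it only checks whether every digit equals 1.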

The agreement between EQ-5D-5L and SF-6DV2 HUSs was tested using the intraclass correlation coefficient (ICC), computed with a two-way mixed-effects model based on absolute agreement. The ICC ranges from 0 to 1, and values of < 0.5, 0.5–0.75, and > 0.75 indicate poor, moderate, and good agreement, respectively [34,35,36]. Bland-Altman plots were also constructed to visually examine the utility differences between the two instruments. Agreement is deemed perfect if the between-instrument differences have a mean of 0 and scatter randomly within 1.96 SD of the mean [37, 38].
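The two agreement analyses can be sketched as follows. This is a minimal illustration, not the SPSS or Stata routine used in the study: it assumes the single-measure form of the two-way, absolute-agreement ICC computed from ANOVA mean squares, and it computes the Bland-Altman bias and 1.96 SD limits of agreement. The function names and the paired scores are hypothetical.

```python
from statistics import mean, stdev


def icc_absolute_agreement(first: list, second: list) -> float:
    """Single-measure, two-way, absolute-agreement ICC for paired scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    n, k = len(first), 2  # n subjects, k instruments
    rows = list(zip(first, second))
    grand = mean(list(first) + list(second))
    row_means = [mean(row) for row in rows]
    col_means = [mean(first), mean(second)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in rows for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_rows - ms_error) / denominator


def bland_altman_limits(first: list, second: list) -> tuple:
    """Return (bias, lower limit, upper limit) of first minus second."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


if __name__ == "__main__":
    # Hypothetical paired utilities, for illustration only.
    eq5d = [0.7, 0.9, 1.0]
    sf6d = [0.6, 0.8, 0.9]
    print(round(icc_absolute_agreement(eq5d, sf6d), 3))
    print(bland_altman_limits(eq5d, sf6d))
```

In the hypothetical data one instrument is shifted upward by a constant, so the scores are perfectly consistent yet the absolute-agreement ICC stays below 1, which is why this form of the ICC penalizes a systematic difference between instruments.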

Convergent validity of EQ-5D-5L and SF-6DV2 similar dimensions (i.e., MO and SC vs. PF, UA and RL vs. SF, PD vs. PN, AD vs. MH) (Appendix 1) and their HUSs were evaluated by using the Spearman’s correlation coefficient (r): >0.5 (strong correlation), 0.35–0.5 (moderate correlation), 0.20–0.35 (weak correlation), and < 0.20 (poor correlation) [39].
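A minimal sketch of this step is shown below: Spearman's coefficient computed as the Pearson correlation of average ranks, followed by the strength categories listed above. The handling of values that fall exactly on a cut-off (assigned to the higher category) is an assumption, since the text gives ranges only; the function names are hypothetical.

```python
from math import sqrt
from statistics import mean


def average_ranks(values: list) -> list:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1  # mean of 1-based positions
        for position in range(start, end + 1):
            ranks[order[position]] = tied_rank
        start = end + 1
    return ranks


def spearman(x: list, y: list) -> float:
    """Spearman's r: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def correlation_strength(r: float) -> str:
    """Classify |r| using the cut-offs given in the text."""
    size = abs(r)
    if size > 0.5:
        return "strong"
    if size >= 0.35:  # boundary handling is an assumption
        return "moderate"
    if size >= 0.20:
        return "weak"
    return "poor"


if __name__ == "__main__":
    print(correlation_strength(0.692))  # coefficient reported for staff HUSs
```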

Known-groups validity of EQ-5D-5L and SF-6DV2 HUSs was assessed by testing their ability to identify subgroups with known differences in health status. To this end, the samples were classified separately according to each self-reported clinical condition, i.e., disease, symptom or discomfort in the past 12 months, injury in the past 12 months, and the number of the three conditions. Those with a condition, or with more conditions, were expected to have worse health status than their respective counterparts. The p-value of the F test in ANOVA was used as the indicator. Sensitivity was compared using relative efficiency (RE), calculated as the ratio of F-statistic values [40]. A higher RE indicates a better ability to detect statistically significant differences between subgroups. In this study, the F-statistic of the EQ-5D-5L HUS was used as the reference to calculate the RE of the SF-6DV2 HUS and the EQ-VAS score. Accordingly, RE < 1 means that the EQ-5D-5L HUS is more efficient.
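The relative-efficiency comparison can be sketched as below: a one-way ANOVA F-statistic is computed for each score across the known groups, and the F-statistic of the comparator is divided by that of the reference (here the EQ-5D-5L HUS). The function names and the group scores are hypothetical and are not study data.

```python
from statistics import mean


def f_statistic(groups: list) -> float:
    """One-way ANOVA F: between-group over within-group mean square."""
    k = len(groups)
    n = sum(len(group) for group in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 groups and more observations than groups")
    grand = sum(sum(group) for group in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def relative_efficiency(f_comparator: float, f_reference: float) -> float:
    """RE > 1: comparator more efficient; RE < 1: reference more efficient."""
    return f_comparator / f_reference


if __name__ == "__main__":
    # Hypothetical scores for participants without / with a condition.
    eq5d_groups = [[0.95, 1.00, 0.90], [0.85, 0.90, 0.80]]
    sf6d_groups = [[0.85, 0.90, 0.80], [0.60, 0.70, 0.65]]
    f_ref = f_statistic(eq5d_groups)
    f_cmp = f_statistic(sf6d_groups)
    print(round(relative_efficiency(f_cmp, f_ref), 2))
```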

Data were analyzed using SPSS 26.0 and Stata 17.0. All tests were two-sided with a significance level of p < 0.05.


Characteristics of the two samples

There were 474 respondents, of whom 291 were university staff. The student sample comprised 99 undergraduates and 84 postgraduates (Table 2). The mean (SD) ages of staff and students were 39 (9.6) years and 25.0 (8.5) years, respectively. The staff sample had slightly more males (55.3%), while the student sample had more females (54.6%). The proportion with a smoking habit was below 15% in both samples. The mean (SD) BMIs were 21.6 (3.4) and 23.6 (4.11) for staff and students, respectively. A bigger proportion of students (54.6%) than staff (54.6%) maintained a normal BMI. Staff reported higher prevalences of diseases and symptoms/discomfort than students but a lower prevalence of injuries in the past year. This is expected, as staff were older while students were more active and risk-taking.

Table 2 Characteristics of University staff and students (N = 474)

HRQOL profile

As shown in Table 3, EQ-5D-5L showed a high ceiling effect: 43.3% in staff and 51.4% in students. More than 92% of respondents reported “no problems” on “Mobility”, “Self-care” and “Usual activities” in both samples, as they were generally healthy and able to carry out daily tasks. With regard to PD and AD, 41.6% of staff and 30.1% of students reported problems on these two dimensions. Similar response distributions were observed in the two samples, although staff systematically reported more problems than students. Accordingly, the mean (SD) HUS of staff was 0.92 (0.11), lower than the 0.95 (0.08) for students, and the mean (SD) EQ-VAS score of staff was 77.5 (14.8), also lower than the 84.5 (14.4) for students.

Table 3 Distributions of responses to each of the EQ-5D-5L dimension in the two samples

The response distributions of SF-6DV2 exhibited a different pattern from those of EQ-5D-5L (Table 4). Both university staff and students reported more problems on SF-6DV2 than on EQ-5D-5L. Less than half of the staff sample reported “no problems” on each of the six dimensions. The worst was the VT dimension, where only 30 (10.3%) staff members did not feel tired in the past four weeks. Similar to the staff, students also had the most problems in the VT dimension. As with EQ-5D-5L, students showed a better HRQOL profile than staff. With more problems detected by SF-6DV2, the mean (SD) SF-6DV2 HUS was 0.76 (0.14) in staff, which was significantly lower, by 0.16, than that derived from EQ-5D-5L. Likewise, the mean HUS of students was significantly lower when measured by SF-6DV2 (0.82) than by EQ-5D-5L (0.95). The ceiling effects of SF-6DV2 were lower than those of EQ-5D-5L: 7.6% vs. 43.3% in staff and 20.2% vs. 51.4% in students.

Table 4 Distributions of responses to each of the SF-6DV2 dimension in the two samples

For the students, the EQ-VAS score and EQ-5D-5L HUS were severely skewed, while the SF-6DV2 HUS was more evenly distributed. With skewness values of -2.027, -3.035 and -0.359 for the EQ-VAS score, EQ-5D-5L HUS and SF-6DV2 HUS, respectively, all three scores were left-skewed. The EQ-5D-5L HUS was concentrated between 0.8 and 1.0, and the EQ-VAS score was mainly concentrated between 80 and 100 (Fig. 1).

Fig. 1 Distribution of EQ-VAS score, EQ-5D-5L, and SF-6DV2 utility scores in the two samples

Agreement between the EQ-5D-5L and SF-6DV2 utility scores

The HUSs of the two instruments were in moderate agreement, with ICCs of 0.567 and 0.553 for staff and students, respectively. The Bland-Altman plots appeared to confirm this. In both samples, over 95% of points fell within the limits of agreement (university staff: 99.95%; university students: 99.97%). The HUSs from EQ-5D-5L were usually higher than those from SF-6DV2. However, for subjects with low HUSs (< 0.6), EQ-5D-5L produced lower HUSs than SF-6DV2. This pattern appeared in both samples (Fig. 2).

Fig. 2 Bland-Altman plot of the EQ-5D-5L and SF-6DV2 utility scores in the two samples

Construct validity

According to Table 5, several similar dimensions of EQ-5D-5L and SF-6DV2 failed to show the good convergent validity that was theoretically expected. Specifically, SF-6DV2 PF correlated weakly with the EQ-5D-5L MO and SC dimensions in both samples. However, the EQ-5D-5L PD and AD dimensions showed strong correlations with the similar SF-6DV2 dimensions PN and MH, respectively. The correlation coefficients were 0.748 and 0.563 among staff, and 0.623 and 0.645 among students. Discriminant validity was suggested by the finding that the SF-6DV2 MH dimension was not significantly correlated with the purely physical constructs of EQ-5D-5L (MO, SC or UA). Notably, the SF-6DV2 RL and VT dimensions tended to have stronger and more significant correlations with the PD and AD dimensions of EQ-5D-5L than with MO, SC or UA. This is consistent with previous reports that SF-6DV2 is more socially oriented whereas EQ-5D-5L is more physically oriented.

Table 5 Correlations of the dimensions of EQ-5D-5L and SF-6DV2 in the two samples

Correlations between the EQ-VAS score and the EQ-5D-5L and SF-6DV2 HUSs are shown in Table 6. For university staff, the coefficients were 0.592 (EQ-VAS and EQ-5D-5L HUS), 0.570 (EQ-VAS and SF-6DV2 HUS), and 0.692 (EQ-5D-5L and SF-6DV2 HUSs), all indicating a strong correlation. For the students, the coefficients were 0.421 (EQ-VAS and EQ-5D-5L HUS), 0.442 (EQ-VAS and SF-6DV2 HUS), and 0.703 (EQ-5D-5L and SF-6DV2 HUSs). Among the three pairs, the EQ-5D-5L and SF-6DV2 HUSs had the strongest correlation in both samples.

Table 6 Correlation of EQ-VAS score, EQ-5D-5L and SF-6DV2 utility scores in the two samples

Known-groups validity and sensitivity of the utility scores

The results of known-groups validity and sensitivity for the EQ-VAS score and the EQ-5D-5L and SF-6DV2 HUSs are shown in Table 7. Among university staff, the EQ-VAS score and the EQ-5D-5L and SF-6DV2 HUSs all found significant differences for three known-groups (with and without disease, with and without symptom or discomfort, and number of any of the three conditions). SF-6DV2 HUS was more efficient than EQ-5D-5L HUS and EQ-VAS score in detecting the three conditions (RE > 1 for both). On the other hand, EQ-5D-5L HUS and EQ-VAS score were more sensitive than SF-6DV2 HUS in distinguishing staff with and without injury. EQ-5D-5L HUS was also more discriminative than EQ-VAS score for two known-groups (i.e., with and without symptom or discomfort, and with and without injury). In contrast, EQ-VAS score was more discriminative than EQ-5D-5L HUS for the other two known-groups (with and without disease, and number of any of the three conditions).

Table 7 Known-groups validity and sensitivity of EQ-VAS score, EQ-5D-5L and SF-6DV2 utility scores in the two samples

Among the students, both EQ-5D-5L and SF-6DV2 HUSs could detect significant differences in all the known-groups. EQ-5D-5L HUS was more efficient than SF-6DV2 HUS and EQ-VAS score in detecting differences in two known-groups (with and without symptom or discomfort, and number of any of the three conditions) (RE < 1), while SF-6DV2 HUS was more efficient than EQ-5D-5L HUS in detecting disease and injury (RE > 1 for both).


The measurement performance of generic preference-based measures (GPBMs) varies considerably across populations, and GPBM instruments are usually not interchangeable. This makes it necessary to study the psychometric performance of even widely used GPBMs in specific populations and decision-making settings. This study investigated the psychometric properties of EQ-5D-5L and SF-6DV2 in two sample populations from the higher-education sector. The results showed that the EQ-5D-5L and SF-6DV2 HUSs had acceptable convergent validity and known-groups validity. Nevertheless, only two pairs of dimensions, EQ-5D-5L PD with SF-6DV2 PN and EQ-5D-5L AD with SF-6DV2 MH, showed the expected good convergent validity. Although the HUSs of the two questionnaires were in moderate agreement, they were not interchangeable. The SF-6DV2 seems preferable in the study populations, as it displayed a lower ceiling effect and better distributional properties than the EQ-5D-5L.

Similar to previous findings [17, 40], the EQ-5D-5L systematically yielded higher HUSs than SF-6DV2 in both staff and students. The HUS differences of 0.17 and 0.13 for staff and students, respectively, were statistically significant. This may suggest that EQ-5D-5L overestimated health status, given that its ceiling effect was 5.7 and 2.54 times that of SF-6DV2 in staff and students, respectively. It further illustrates an important issue: the choice of HRQOL measurement tool could substantially affect decision-making about resource allocation in the context of higher education. Two reasons could account for the differences. First, the EQ-5D-5L utility is determined by self-rated health status on the day of the survey, while the SF-6DV2 covers a longer period, the past four weeks. Thus, the SF-6DV2 can theoretically capture more health-related problems than the EQ-5D-5L [22]. For example, a respondent could be free of pain/discomfort on a single day but may have experienced it at some time in the past four weeks. Second, the SF-6DV2 HUS includes a unique contribution from the vitality dimension, which would reflect additional HRQOL impairment.

The overall agreement of the HUSs between the two instruments was moderate, with ICCs of 0.567 and 0.553. These values are lower than the ICC found in a sample (n = 19,177) drawn from the general population (ICC = 0.75) [15]. Visual inspection of agreement using Bland-Altman plots (Fig. 2) demonstrated not only the systematically higher utility of EQ-5D-5L relative to SF-6DV2, but also some consistency. These findings are in line with prior results [41, 42]. The plots showed that HUS differences increased in the lower range of HUS and, interestingly, that the EQ-5D-5L produced lower HUSs than the SF-6DV2 when the HUS was below a certain threshold. This could be attributed to the difference in the utility scoring functions of the two instruments. That is, the difference between the coefficients of the two functions generally increases with health-state severity, and the EQ-5D-5L scoring function tends to generate lower HUSs for health states with severe or very severe problems (Table 1).

Regarding construct validity, strong correlations were observed between the similar dimensions (PD and PN; AD and MH) of EQ-5D-5L and SF-6DV2. However, the correlations were not as strong for other theoretically similar dimensions (MO/SC with PF in both samples, UA and SF among university staff). This finding is in line with the results of a general population study in China [20] and may be attributed to the different connotations of the similar dimensions. For instance, EQ-5D-5L MO and SC both involve simple activities (walking, bathing or dressing), whereas SF-6DV2 PF includes both high-intensity and moderate-intensity activities (running, lifting a table, etc.). In addition, the EQ-5D-5L puts emphasis on physical functions, while SF-6DV2 is more socially related [5, 43, 44]. In reality, usual activities can be performed without social contact, so the UA dimension of EQ-5D-5L may not be strongly correlated with the SF dimension of SF-6DV2 in our case. We also found that correlations seemed to occur more frequently in the staff sample than in the student sample. There are two possible reasons. First, university staff have worse health than students, hence the two instruments tend to converge in identifying the same HRQOL problems. Second, university students are likely to have a greater variety of daily activities than staff, and therefore PF correlates slightly less with MO and SC in students than in staff. The staff-student difference is likely related to both occupation and age. So far, evidence directly comparing the HRQOL of university staff and students is rare. However, studies have shown that younger age is associated with better health status [45, 46]. On the other hand, research from an occupational health perspective has indicated suboptimal HRQOL among college teachers [23].

We also found strong correlations among the three overall health indicators, showing convergent validity of the two instruments in our study population. Furthermore, the correlation between the HUSs was stronger than their correlations with the EQ-VAS score. This may be because both HUSs reflect the health preferences of the general Chinese population. This differs from the result in the Thai general population, where the correlation between EQ-5D-5L HUS and EQ-VAS score was stronger than their correlations with SF-6DV2 HUS [17]. That study used the SF-6D value set from the UK and the EQ-5D value set from Thailand, which may explain the difference in results.

In terms of known-groups validity, the HUSs and EQ-VAS score discriminated between the majority of groups with known differences in health status, supporting the discriminant validity of the questionnaires. The exception was injury, for which both HUSs and the EQ-VAS score were weak at discriminating between staff with and without injury. This finding may be attributed to two factors. First, disease and symptom/discomfort co-existed among the staff without injury in the last 12 months (75 university staff). Second, the injuries of university staff were minor. Meanwhile, the scores differed in their sensitivity to HRQOL differences between the known-groups. The EQ-5D-5L and SF-6DV2 HUSs were generally better than the EQ-VAS score. One potential reason is that the two HUSs are based on information about five or six health aspects, while the EQ-VAS score reflects the global health of an individual and is insensitive to health impairment in a particular dimension. This is similar to the finding in depressed patients that the SF-6D HUS had better sensitivity than the EQ-VAS score [47]. With regard to the sensitivity of the two HUSs, we found that neither was superior to the other. Previous studies also reported inconsistent findings in general populations in Asia [17, 48]. Apart from differences in study design, population and method (e.g., the order of the two instruments), the finding could also be due to their scoring functions: the EQ-5D-5L function tends to yield lower HUSs for severe health states, offsetting the advantage of the SF-6D descriptive system (more dimensions).

The strength of our study is the randomized order of the two instruments, which avoids the order effect [49]. Our study also has two limitations. First, it is a cross-sectional study, so test-retest reliability and responsiveness could not be assessed. Second, HRQOL data were collected by having participants complete the paper versions of the instruments online. This practice might have affected the quality of the data. Nevertheless, the participants are highly educated and familiar with internet use, which should support the validity and reliability of the HRQOL data to a large extent.


In conclusion, it appears that both EQ-5D-5L and SF-6DV2 HUSs have acceptable measurement properties, including convergent validity, known-groups validity and sensitivity, in Chinese university staff and students. However, only two pairs of dimensions, EQ-5D-5L PD with SF-6DV2 PN and EQ-5D-5L AD with SF-6DV2 MH, demonstrated the anticipated good convergent validity. Future studies are warranted to evaluate other measurement properties, such as test-retest reliability and responsiveness, of the two instruments in these populations.

Data Availability

Data is not suitable for public deposition due to ethical concerns. Requests for data may be sent to the corresponding author: Wang Pei; Email Address:


  1. Lamu AN, Olsen JA. Testing alternative regression models to predict utilities: mapping the QLQ-C30 onto the EQ-5D-5L and the SF-6D. Qual Life Res. 2018;27(11):2823–39.

  2. Sun S, Chen J, Johannesson M, Kind P, Xu L, Zhang Y, Burström K. Population health status in China: EQ-5D results, by age, sex and socio-economic status, from the National Health Services Survey 2008. Qual Life Res. 2011;20(3):309–20.

  3. Liu GG. China guidelines for pharmacoeconomic evaluations 2020. Beijing, China: China Market Press; 2020.


  4. Luo N, Wang P, Fu AZ, Johnson JA, Coons SJ. Preference-based SF-6D scores derived from the SF-36 and SF-12 have different discriminative power in a population health survey. Med Care. 2012;50(7):627–32.


  5. Brooks R. EuroQol: the current state of play. Health Policy. 1996;37(1):53–72.


  6. Poór AK, Rencz F, Brodszky V, Gulácsi L, Beretzky Z, Hidvégi B, Holló P, Kárpáti S, Péntek M. Measurement properties of the EQ-5D-5L compared to the EQ-5D-3L in psoriasis patients. Qual Life Res. 2017;26(12):3409–19.

  7. Herdman M, Gudex C, Lloyd A, Janssen M, Kind P, Parkin D, Bonsel G, Badia X. Development and preliminary testing of the new five-level version of EQ-5D (EQ-5D-5L). Qual Life Res. 2011;20(10):1727–36.

  8. Buchholz I, Janssen MF, Kohlmann T, Feng YS. A systematic review of studies comparing the Measurement Properties of the three-level and five-level versions of the EQ-5D. PharmacoEconomics. 2018;36(6):645–61.


  9. Zhu J, Yan XX, Liu CC, Wang H, Wang L, Cao SM, Liao XZ, Xi YF, Ji Y, Lei L, Xiao HF, Guan HJ, Wei WQ, Dai M, Chen W, Shi JF. Comparing EQ-5D-3L and EQ-5D-5L performance in common cancers: suggestions for instrument choosing. Qual Life Res. 2021;30(3):841–54.

  10. Brazier J, Usherwood T, Harper R, Thomas K. Deriving a preference-based single index from the UK SF-36 Health Survey. J Clin Epidemiol. 1998;51(11):1115–28.


  11. Brazier JE, Mulhern BJ, Bjorner JB, Gandek B, Rowen D, Alonso J, Vilagut G, Ware JE, SF-6Dv2 International Project Group. Developing a New Version of the SF-6D health state classification system from the SF-36v2: SF-6Dv2. Med Care. 2020;58(6):557–65.


  12. Lam CL, Brazier J, McGhee SM. Valuation of the SF-6D health states is feasible, acceptable, reliable, and valid in a Chinese population. Value Health. 2008;11(2):295–303.

  13. Poder TG, Fauteux V, He J et al. Consistency between three different ways of administering the short form 6 dimension version 2. Value Health. 2019;22(7):837–42. doi:10.1016/j.jval.2018.12.012.

  14. McDool E, Mukuria C, Brazier J. A comparison of the SF-6Dv2 and SF-6D UK Utility values in a mixed patient and healthy Population. PharmacoEconomics. 2021;39(8):929–40.


  15. Wu J, Xie S, He X, Chen G, Bai G, Feng D, Hu M, Jiang J, Wang X, Wu H, Wu Q, Brazier JE. Valuation of SF-6Dv2 Health states in China using Time Trade-off and discrete-choice experiment with a duration dimension. PharmacoEconomics. 2021;39(5):521–35.


  16. Shiroiwa T, Fukuda T, Ikeda S, Igarashi A, Noto S, Saito S, Shimozuma K. Japanese population norms for preference-based measures: EQ-5D-3L, EQ-5D-5L, and SF-6D. Qual life Research: Int J Qual life Aspects Treat care Rehabilitation. 2016;25(3):707–19.

  17. Kangwanrattanakul K. A comparison of measurement properties between UK SF-6D and English EQ-5D-5L and Thai EQ-5D-5L value sets in general Thai population. Expert Rev PharmacoEcon Outcomes Res. 2021;21(4):765–74.

  18. Bharmal M, Thomas J 3rd. Comparing the EQ-5D and the SF-6D descriptive systems to assess their ceiling effects in the US general population. Value Health. 2006;9(4):262–71.

  19. Cunillera O, Tresserras R, Rajmil L, Vilagut G, Brugulat P, Herdman M, Mompart A, Medina A, Pardo Y, Alonso J, Brazier J, Ferrer M. Discriminative capacity of the EQ-5D, SF-6D, and SF-12 as measures of health status in population health survey. Qual Life Res. 2010;19(6):853–64.

  20. Zhao L, Liu X, Liu D, He Y, Liu Z, Li N. Comparison of the psychometric properties of the EQ-5D-3L and SF-6D in the general population of Chengdu city in China. Medicine. 2019;98(11):e14719.

  21. Sun CY, Liu Y, Zhou LR, Wang MS, Zhao XM, Huang WD, Liu GX, Zhang X. Comparison of EuroQol-5D-3L and Short Form-6D utility scores in family caregivers of colorectal cancer patients: a cross-sectional survey in China. Front Public Health. 2021;9:742332.

  22. Xie S, Wang D, Wu J, Liu C, Jiang W. Comparison of the measurement properties of SF-6Dv2 and EQ-5D-5L in a Chinese population health survey. Health Qual Life Outcomes. 2022;20(1):96.

  23. Liu C, Wang S, Shen X, Li M, Wang L. The association between organizational behavior factors and health-related quality of life among college teachers: a cross-sectional study. Health Qual Life Outcomes. 2015;13:85.

  24. Ge C, Yang X, Fan Y, Kamara AH, Zhang X, Fu J, Wang L. Quality of life among Chinese college teachers: a cross-sectional survey. Public Health. 2011;125(5):308–10.

  25. Liu X, Cao X, Gao W. Does low self-esteem predict anxiety among Chinese college students? Psychol Res Behav Manag. 2022;15:1481–7.

  26. Kuczynski AM, Kanter JW, Robinaugh DJ. Differential associations between interpersonal variables and quality-of-life in a sample of college students. Qual Life Res. 2020;29(1):127–39.

  27. Yang X, Ge C, Hu B, Chi T, Wang L. Relationship between quality of life and occupational stress among staff. Public Health. 2009;123(11):750–5.

  28. Lizana PA, Vega-Fernadez G. Teacher teleworking during the COVID-19 pandemic: association between work hours, work-family balance and quality of life. Int J Environ Res Public Health. 2021;18(14):7566.

  29. Payakachat N, Gubbins PO, Ragland D, Flowers SK, Stowe CD. Factors associated with health-related quality of life of student pharmacists. Am J Pharm Educ. 2014;78(1):7.

  30. He F, Shen M, Zhao Z, Liu Y, Zhang S, Tang Y, Xie H, Chen X, Li J. Epidemiology and disease burden of androgenetic alopecia in college freshmen in China: a population-based study. PLoS ONE. 2022;17(2):e0263912.

  31. Zhang W, Xie S, Xue F, Liu W, Chen L, Zhang L, Wu J, Yang R. Health-related quality of life among adults with haemophilia in China: a comparison with age-matched general population. Haemophilia. 2022;28(5):776–83.

  32. Luo N, Liu G, Li M, Guan H, Jin X, Rand-Hendriksen K. Estimating an EQ-5D-5L value set for China. Value Health. 2017;20(4):662–9.

  33. Feng Y, Parkin D, Devlin NJ. Assessing the performance of the EQ-VAS in the NHS PROMs programme. Qual Life Res. 2014;23(3):977–89.

  34. Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. 2016;15(2):155–63.

  35. Kim SH, Kim HJ, Lee SI, Jo MW. Comparing the psychometric properties of the EQ-5D-3L and EQ-5D-5L in cancer patients in Korea. Qual Life Res. 2012;21(6):1065–73.

  36. Hunger M, Sabariego C, Stollenwerk B, Cieza A, Leidl R. Validity, reliability and responsiveness of the EQ-5D in German stroke patients undergoing rehabilitation. Qual Life Res. 2012;21(7):1205–16.

  37. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1(8476):307–10.

  38. Thaweethamcharoen T, Noparatayaporn P, Sritippayawan S, Aiyasanon N. Comparison of EQ-5D-5L, VAS, and SF-6D in Thai patients on peritoneal dialysis. Value Health Reg Issues. 2019;18:59–64.

  39. Mukaka MM. Statistics corner: a guide to appropriate use of correlation coefficient in medical research. Malawi Med J. 2012;24(3):69–71.

  40. Wu J, Han Y, Zhao FL, Zhou J, Chen Z, Sun H. Validation and comparison of EuroQoL-5 dimension (EQ-5D) and short Form-6 dimension (SF-6D) among stable angina patients. Health Qual Life Outcomes. 2014;12:156.

  41. Ye Z, Sun L, Wang Q. A head-to-head comparison of EQ-5D-5L and SF-6D in Chinese patients with low back pain. Health Qual Life Outcomes. 2019;17(1):57.

  42. Cheung PWH, Wong CKH, Cheung JPY. Differential psychometric properties of EuroQoL 5-Dimension 5-Level and Short-Form 6-Dimension utility measures in low back pain. Spine. 2019;44(11):E679–86.

  43. Søgaard R, Christensen FB, Videbaek TS, Bünger C, Christiansen T. Interchangeability of the EQ-5D and the SF-6D in long-lasting low back pain. Value Health. 2009;12(4):606–12.

  44. Whitehurst DG, Bryan S. Another study showing that two preference-based measures of health-related quality of life (EQ-5D and SF-6D) are not interchangeable. But why should we expect them to be? Value Health. 2011;14(4):531–8.

  45. Yao Q, Liu C, Zhang Y, Xu L. Population norms for the EQ-5D-3L in China derived from the 2013 National Health Services Survey. J Glob Health. 2021;11:08001.

  46. Sun S, Chen J, Johannesson M, Kind P, Xu L, Zhang Y, Burström K. Population health status in China: EQ-5D results, by age, sex and socio-economic status, from the National Health Services Survey 2008. Qual Life Res. 2011;20(3):309–20.

  47. Sakthong P, Munpan W. A head-to-head comparison of UK SF-6D and Thai and UK EQ-5D-5L value sets in Thai patients with chronic diseases. Appl Health Econ Health Policy. 2017;15(5):669–79.

  48. Turner N, Campbell J, Peters TJ, Wiles N, Hollinghurst S. A comparison of four different approaches to measuring health utility in depressed patients. Health Qual Life Outcomes. 2013;11:81.

  49. McColl E, Eccles MP, Rousseau NS, Steen IN, Parkin DW, Grimshaw JM. From the generic to the condition-specific? Instrument order effects in quality of life assessment. Med Care. 2003;41(7):777–90.


We thank all those who have participated in and helped with this study.


This study was funded by the National Natural Science Foundation of China (Grant No. 72274037), awarded to Wang Pei. The authors declare that they have no competing interests in this work.

Author information

Authors and Affiliations



PW and H-JZ contributed to the study design. H-JZ prepared the materials and collected the data. A-XZ conducted the data analysis, and all authors contributed to the interpretation of the data. A-XZ and PW drafted the first version of the manuscript, and all authors commented on previous versions. All authors reviewed and approved the final manuscript.

Corresponding author

Correspondence to Pei Wang.

Ethics declarations

Ethics approval and consent to participate

The collection of human data was conducted in accordance with the Declaration of Helsinki. The study was approved by the IRB committee of the Air Force Medical Center in Beijing (KongTe: NO 2021-169-PJ01). All participants gave informed consent to participate in the study.

Consent for publication

Not Applicable.

Conflict of interest

The authors declare no conflicts of interest.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. A copy of this licence is available from Creative Commons. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Zhou, H.J., Zhang, A., Wei, J. et al. Psychometric performance of EQ-5D-5L and SF-6DV2 in measuring health status of populations in Chinese university staff and students. BMC Public Health 23, 2314 (2023).
