Comparison of self-reports and biomedical measurements on hypertension and diabetes among older adults in China

Background Researchers interested in the effects of health on various life outcomes often use self-reported health and disease as an indicator of true, underlying health status. However, the validity of reporting is questionable as it relies on the awareness, recall bias and social desirability. Accordingly, biomedical test is generally regarded as a more precise indication of the disease. Methods Using data from the third wave of China Health and Retirement Longitudinal Study (CHARLS), we selected individuals aged 40–85 years old who participated in both health interview survey and biomedical test. Sensitivity, specificity, false negative reporting and false positive reporting were used as measurements of (dis) agreement or (in) validity, and binary and multinomial logistic regression were used to estimate under-report or over-report of hypertension and diabetes. Results Self-reported hypertension and diabetes showed low sensitivity (73.24 and 49.21%, respectively) but high specificity (93.61 and 98.05%, respectively). False positive reporting of hypertension and diabetes were 3.97 and 1.67%, while false negative reports were extremely high at 10.14 and 7.38%. Educational attainment, hukou, age and gender affected both group-specific error and overall error with some differences in their magnitude and directions. Conclusion Self-reported conditions underestimate the disease burden of hypertension and diabetes in China. Adding objective measurements into social survey could improve data accuracy and allow better understanding of socioeconomic inequalities in health. Furthermore, there is an urgent need to provide basic health education and physical examination to citizens, and promote the use of healthcare to lower the incidence and unawareness of disease in China.


Background
Hypertension and diabetes are two well-known risk factors of cardiovascular disease, the leading cause of death worldwide with 17.79 million deaths in 2017 [1]. Highquality estimates of prevalence based on biomedical measurements are needed for monitoring cardiovascular disease risks and planning public health preventions and interventions. Due to the high cost and long-time collection of biomedical data, economists and demographers have relied heavily on self-reported hypertension and diabetes to estimate their prevalence and disease burden. However, recent research has raised doubts about the reporting error of self-reported disease [2,3].
Many a study has attempted to assess the value of a self-reported disease by comparing self-reports with objective assessments, which has many advantages: precision of measurement, reliability, and less bias than questionnaires [4]. The criterion shows the discrepancy between self-reports and biomedical tests and is usually measured by sensitivity and specificity of the errors in a given group, as well as false reporting of the overall error. Sensitivity was defined as the percentage of respondents who reported hypertension/diabetes among those with biomedical hypertension/diabetes. This value was thus equivalent to hypertension/diabetes awareness among those with diseases; specificity was defined as the percentage of individuals who reported no hypertension/ diabetes among those with 'normal' biomedical measurements; false negative reporting was defined as those who reported no hypertension/diabetes but were diagnosed hypertension/diabetes, and false positive reporting was defined as those who reported hypertension/diabetes but were not diagnosed hypertension/diabetes. Evidence from developed countries indicated that there was a big gap between self-reported diseases and biomedical diseases of hypertension and diabetes and it may differ by socioeconomic groups. More educated and higher socioeconomic individuals might have a better understanding of health information and were more capable of answering survey questions on disease diagnosis [5][6][7][8].
Although developed countries have witnessed substantial disagreements between self-reports and objective diagnosis of the two diseases, the performance of developing countries were unclear due to the limited data. A handful of studies have investigated the incidence of disease in some local areas [9][10][11], but there is still a debate regarding the consistency and direction of the social gradient and health in developing countries [12], which means socioeconomic status might differ in developing countries compared to developed contexts. One possible explanation is that economic development takes place unequally across regions, and so does progress in the awareness of a disease. In addition, populations are exposed to different health-enhancing or health-damaging conditions. For example, the household registration (hukou) system divided China into two separated societies, with the majority of the population confined in the rural areas and entitled to fewer rights and benefits compared with urban residents.
With approximately 20% of the world's population, China is experiencing rapid population aging and an epidemiological transition moving from the primacy of acute, infectious and deficiency diseases to the increasing dominance of non-communicable and chronic conditions [12]. Report on Nutrition and Chronic Diseases of Chinese Residents (2015) shows that the prevalence of hypertension and diabetes in China has reached 25.2 and 9.7% over 18 years old, respectively. The lack of medical insurance and the high cost of the health care market may lead to incorrect perception of disease prevalence, which means under-diagnosis of chronic disease is not uncommon in China.
However, whether and to what extent the self-reported and biological hypertension and diabetes conflict is not clear in China. Meanwhile, diabetes mellitus causing the most death rises from the 19th in 1990 to 8th in 2017 and high systolic blood pressure was the leading risk factors contributing to deaths and Disability-Adjusted Life Year (DALY) in China [13]. Therefore, understanding the reality of the two diseases, estimating the burden of diseases and intervening the risk of death from diseases are very important in China.

Study design and setting
Data for this analysis are from China Health and Retirement Longitudinal Study (CHARLS), which is a nationally representative survey in China, designed by the National School for Development (China Center for Economic Research) together with the Institute for Social Science Survey at Peking University. The baseline wave of CHARLS was being fielded in 2011 and included about 10,000 households and 17,500 individuals in 150 counties/districts and 450 communities. The multistage sample was drawn at each stage based on probability proportional-to-size random-sampling procedures. The survey collected detailed demographic background, socioeconomic information, health status and functioning. The analysis draws on data of Wave 3 (2015), because it collected reliable venous blood samples which allow us to estimate the discrepancy between self-reported diseases and underlying biomarker levels among CHARLS respondents. Detailed information about the CHARLS blood sampling procedure and data quality management has been published previously [14]. All participants signed informed consent, and CHARLS was approved by the Ethical Review Committee of Peking University.
We restricted respondents aged 40 to 85 years old with valid self-reported diseases and biomedical tests. The Wave 3 (2015) included 20,284 respondents asked to consent to a venous blood draw and blood pressure measurement; 13,013 respondents provided venous blood and 16,406 respondents provided blood pressure information. Combined with those respondents who also provided detailed sociodemographic information, the final sample sizes of hypertension and diabetes were 14, 462 and 12,189, respectively. Characteristics of the sample are summarized in Table 1. 1

Self-reports and biomedical measurements of hypertension and diabetes
Self-reported data on hypertension and diabetes were obtained by the question, "Have you been diagnosed with [conditions listed below, read one by one] by a doctor?", and there are 14 options, which are "Hypertension, Dyslipidemia, Diabetes, Cancer or malignant tumor, Chronic lung diseases, Liver disease, Stroke, Heart problems, Kidney disease, Stomach or other digestive disease, Emotional, nervous, or psychiatric problems, Memoryrelated disease, Arthritis or rheumatism and Asthma". 2 If a respondent answered hypertension or diabetes, we defined the self-reported hypertension or diabetes as 1, otherwise as 0. Biomedical blood pressure was measured three times (approximately 45 s apart) on a single occasion, using an electronic monitor. We take the average of the last 2 readings, after excluding the first reading to avoid white coat hypertension. Hypertension was defined as a systolic blood pressure ≥ 140 mmHg and/or a diastolic blood pressure ≥ 90 mmHg and/or current use of antihypertensive medication, following the WHO guideline [15]; biomedical diabetes was measured by venous blood data which provided glycated hemoglobin (HbA1c). The diagnostic criterion for diabetes in our study was defined as HbA1c values ≥6.5%. If a respondent's glycated hemoglobin was over 6.5%, we defined the biomedical diabetes as 1, otherwise as 0. HbA1c is more expensive than the routinely conducted test, and may not be the most widely used screening test, however, it can be measured at any time of the day regardless of the duration of fasting or the content of the previous meal and it is a good predictor for both the micro-and macrovascular complications of diabetes. The additional benefits in predicting costly preventable clinical complications may make it a costeffective choice [16].

Economic resources
Educational attainment was measured by three levels. This variable indicated the highest educational degree attained by respondents at the survey point. The lowest category was individuals holding no formal education (illiterate); intermediate education ranged from not finishing primary school, home school to elementary school; and the highest category was respondents holding middle school or above degree. Hukou was measured by whether the respondent held a rural hukou (0 = No, 1 = Yes).

Health behaviors
Drinking was a 5-category variable indicating the frequency of drinking last year: none (coded as 1), less than 3 days/month (coded as 2), less than 3 days/week (coded as 3), 4 to 6 days/week or daily (coded as 4), twice a day or above (coded as 5).
Smoking was a continuous variable indicating the number of cigarettes/day, which ranged from 0 to 100.

Demographic characteristics
Gender was a binary variable: male (coded as 1), female (coded as 0). Age was a 5-category variable ranging from 40 to 49, 50-59, 60 to 69, 70-79, 80 and above. Marital status indicated whether the respondent was in a marriage status: separated, divorced, widowed and never married (coded as 0), married or partnered (coded as 1).

Analytic strategy
Our first step was to assess the difference in prevalence estimates based on two data collection methods, the prevalence of hypertension and diabetes were calculated according to self-reported information, as well as according to the results of biomedical measurements obtained from the CHARLS. We use the formula to calculate the degree of underestimation as follows.
To assess the accuracy of self-reported data, sensitivity, specificity, false negative reporting and false positive reporting were also calculated, respectively. Only sensitivity or specificity was of no practical use when it came to helping the clinician estimate the probability of disease in individual patients [17]. In addition, sensitivity and specificity assessed group-specific errors in diagnosed or undiagnosed diseases, respectively, but not overall errors. We identified both overall error and the group-specific error and assessed sociodemographic characteristics that are correlated with misreporting (sensitivity, specificity, false negative reporting and false positive reporting). Controlling for educational attainment, hukou, drinking, number of cigarettes/day, age, gender and marital status, binary and multinomial logistic regression analysis were applied. As the overall error outcome (correct reporting, false negative reporting and false positive reporting) has more than two categories. The model equations are set as follows 3 :

Results
Sensitivity, specificity and false reporting of hypertension and diabetes The prevalence of hypertension was 37.88% based on biomedical test and 31.72% based on self-reported data, indicating that self-reporting led to an underestimation of hypertension by 16.26%. Likewise, the prevalence of diabetes was 14.52% based on biomedical test and 8.81% based on self-reports, indicating an underestimated prevalence of diabetes by 39.32%. Both the prevalence of self-reports and biomedical hypertension and diabetes increased with age in China, however, biomedical hypertension and diabetes rose considerably faster with age than self-reporting, which meant the discrepancy between self-reports and biomedical hypertension increased with age as was shown in Figs. 1 and 2. This is suggestive of undiagnosed hypertension and diabetes becoming more of a problem increasing with individuals' age. Since China has the largest number of older adults in the world according to United Nations data, and undiagnosed high blood pressure and diabetes may become increasingly common over time. Table 2 provides the sensitivity, specificity and false reporting of self-reported hypertension and diabetes compared with biomedical tests. The overall sensitivity and specificity of self-reported hypertension were 73.24 and 93.61%, which meant 26.76% of respondents didn't know they had hypertension, and 6.39% of respondents falsely thought they had hypertension. The false reporting of hypertension was 14.11%: specifically false positive reporting of hypertension was 3.97% and false negative reporting was 10.14%. For diabetes, the overall sensitivity and specificity were 49.21 and 98.05%, which meant over 50% of respondents didn't know they had hypertension, and only less than 2% of respondents falsely thought they had diabetes. The false reporting of diabetes was 9.05%: specifically false positive reporting of hypertension was 1.67% and false negative reporting was 7.38%, which meant about 10% of respondents misreport diabetes status.
Comparing the four indicators above of hypertension and diabetes, we found that the overall misreporting error is different from the group-specific error. Taken together, these results were suggestive of a substantial public health problem of undiagnosed hypertension and  Sociodemographic characteristics of sensitivity, specificity and false reporting The estimated effects on our two sets of the discrepancy between self-reported hypertension/diabetes and their biomedical tests using binary and multinomial logistic regression were shown in Tables 3 and 4. Columns 1, 2 and 3 showed the estimated effects of predictor variables on sensitivity and specificity between self-reported conditions and their biomedical tests, and correct reporting using binary logistic regression models. Columns 4 and 5 showed the estimated effects of predictor variables on false negative and positive reporting using multinomial logistic regression models.

Hypertension
First, we looked into the group-specific error. Respondent characteristics associated with sensitivity found that individuals with higher educational attainment, urban hukou, aged ≥50 years, female, and having healthy lifestyle, were strongly and independently associated with having more accurate self-reported hypertension than their counterparts among those with biomedical hypertension. The results in Table 3 suggested that no respondent characteristic was significantly associated with more accurate reporting in specificity except for age and healthy lifestyle. Generally, older adults were more likely to erroneously report the absence of hypertension than those younger than 60 years old. People who drank alcohol every day were more likely to report errors among those without hypertension, however, those who smoked more frequently were more likely to correctly report the absence of hypertension. Next, we identified the overall error of hypertension. The likelihood of reporting errors decreased with educational attainment, but not significantly. Urban hukou, no drinking, female, younger age and married respondents were strongly and independently associated with correct reporting (Columns 3 in Table 3). Compared with urban hukou, rural hukou had a 10.4% increase in misreporting. Compared with non-drinkers, people who drank every day or more were more likely to report errors. Males were more likely to erroneously report than females. Older adults were more easily to erroneously report than those younger than 50 years of age, and the error rate increased with age. The unmarried were more prone to misreport than those married.
Educational attainment had a significant effect on the risks of false negative reporting but not significantly on false positive reporting (Columns 4-5 in Table 3). The propensity of false negative reporting went down significantly with educational attainment. Compared with the illiterate, primary education had a 13.2% decrease in the rate of false negative reporting while secondary education and above had a 15.7% decrease. Although people with higher educational attainment reported higher false positive than their counterparts, the effect was not significant. Individuals having a rural hukou, aged 50 years old and above, male, and having unhealthy lifestyle, were strongly associated with false negative reporting. False positive reports were almost not related to sociodemographic characteristics.

Diabetes
For self-reported diabetes, educational attainment, hukou, drinking and age were associated with sensitivity. Aged participants with higher levels of education, having an urban hukou and less drinking were also more likely to accurately self-reported diabetes than their counterparts among those with biomedical diabetes (Column 1 in Table 4). Multivariate analyses showed that younger respondents with a rural hukou and less drinking had slightly more accurate reporting on the absence of diabetes than their counterparts among those without diabetes (Column 2 in Table 4).
Our indicator of educational attainment had almost no statistically significant effect on correct reporting and false positive reporting of diabetes, except that secondary education and above might slightly reduce false negative reporting (Columns 3-5 of Table 4). Contrary to our expectations, false positive reporting between self-reported diabetes and biomedical diabetes did not depend on educational attainment, which was similar to previous research [18]. Compared with respondents having urban hukou, those having rural hukou had a 18.6% increase in the rate of correct reporting due to the lower prevalence of diabetes in rural area. Older adults were more prone to false reporting and false negative reporting than those younger than 50 years of age, and the rate went up significantly with age. To be specific, the false negative reporting of the respondents aged 50-59, 60-69, 70-79, 80 and above years old was 37.1, 67.6, 96.7, 122.8% higher than the reference group.
The results suggest that educational attainment, hukou, age and gender affect both the group-specific error and the overall error of hypertension and diabetes reporting, but there are some differences in their magnitude and directions.

Discussion
Using data from Chinese older adults aged over 40 and above, we analyzed two diseases that were commonly used in clinical evaluations of health-related risk, and estimated the discrepancy between self-reports and biomedical measurements. We found a large difference in the respondents who reported having hypertension and diabetes (31.72 and 8.81%) relative to those who were measured to have two diseases (37.88 and 14.52%). Depending on only self-reported disease might underestimate the true extent of the disease burden in China, and self-reporting led to an underestimation of hypertension and diabetes by 16.26 and 39.32%, respectively. Due to the more complex collection methods and higher cost, we find that diabetes is more prone to be underestimated, which is really not conducive to disease assessment and intervention. We also examined the sociodemographic characteristics correlated with misreporting by using four indicators Note: + p < 0.10; * p < 0.05; ** p < 0.01; *** p < 0.001; odds ratio estimates, logistic and multinomial logistic models, with z scores in parentheses (sensitivity, specificity, false negative reporting and false positive reporting). For hypertension, we found that educational attainment, hukou, age and gender affected both the group-specific error and the overall error, but there were some differences in their magnitude and directions. Educational attainment was an important explanatory factor for sensitivity and false negative reporting, with each additional gradient decreased in education reducing the probability of committing false negative reporting of hypertension, respectively. Besides, hukou, as one of the most important redistributive institutions in China, had an effect on sensitivity and false negative reporting: the rural adults were more likely to falsely report among those people with hypertension and have a false negative reporting. For diabetes, educational attainment was also an important explanatory factor for sensitivity and false negative reporting, which meant the least educated people might underestimate their diabetes status than their counterparts. Besides, hukou had an effect on sensitivity, specificity and correct reporting: the rural adults were more likely to falsely report among those people with diabetes and correctly report among those people without diabetes. Note: + p < 0.10; * p < 0.05; ** p < 0.01; *** p < 0.001; odds ratio estimates, logistic and multinomial logistic models, with z scores in parentheses The elderly were more prone to be aware of the diabetes, while less inclined to report their absence of diabetes. The false reporting and false negative reporting of diabetes went up significantly with age. We draw three lessons from our results. First, selfreported data underestimated the disease burden of hypertension and diabetes, and the underestimation of diabetes was even greater. Adding objective measurements to social survey could improve data accuracy and allow better understanding of socioeconomic inequalities in health. We underline the need to supplement subjective health data with comprehensive and reliable biomedical measurements. Biomarkers are more valid measures of physiological function "under the skin", meaning biosocial approaches to enhance the importance of social factors in the biomedical process and to intervene in social conditions that cause inequity and avoidable inequity will become increasingly important [19]. Second, we found that there was a big discrepancy between self-reported diseases and biomedical tests on hypertension and diabetes in China. China is undergoing rapid population aging, accompanied by non-communicable and chronic conditions. Although inclusion of biological and anthropometric measures of health in surveys expands the possibilities for biomarkers and social construction, underdiagnosis of disease is still common in China. Considering the underdiagnosis of disease, Chinese government needs to increase the awareness of disease and reassess the burden of disease. Third, there is an urgent need to provide basic health education and physical examination to citizens, and promote the use of healthcare to lower the incidence and unawareness of disease in China.
The challenge of identifying causal effects remains universal in most science research, including social stratification and health research. In this regard, longitudinal data with biomarker or genetics are especially useful for sorting out causal effects. For example, having baseline biomarker measures prior to some social exposure or self-assessment enables researchers to explore whether and to what extent age trajectories of self-reported health, biomarkers and their discrepancy, depended on the sociodemographic characteristics. A drawback of the paper is that we used cross-sectional data, which limits causal inferences of sociodemographic on the discrepancy. Another limitation is that biomedical measurement is imperfect. The most important source of confounding is the failure to identify factors that may alter the measurement of the biomarker, such as metabolic factors. Future work could also consider additional data sources and repeated multiple biomarker measurements to complement survey data.

Conclusions
The prevalence of hypertension and diabetes are increasing over age in China, with many old people remaining undiagnosed. Self-reported hypertension and diabetes showed low sensitivity (73.24 and 49.21%, respectively), but high specificity (93.61 and 98.05%, respectively). False positive reporting of hypertension and diabetes were 3.97 and 1.67%, while false negative reports were extremely high at 10.14 and 7.38%. Educational attainment, hukou, age and gender affected both the groupspecific error and the overall error of reporting hypertension and diabetes, but there were some differences in the magnitude and directions. As we know, this is the first report of undiagnosed hypertension and diabetes by using four indicators and evaluating sociodemographic characteristics that were correlated with misreporting in China, and the results confirm self-reported conditions underestimate the disease burden. Adding objective measurements into social survey could improve data accuracy and allow better understanding of socioeconomic inequalities in health.