Understanding the Social Construction of Health by Incorporating Biomarkers in Contemporary China

Background: Researchers interested in the effect of health on various life outcomes often use self-reported health and disease as an indicator of true, underlying health status. However, the validity of self-reports is questionable because it depends on awareness and is subject to recall bias and social desirability. Accordingly, a measured biomarker is generally regarded as a more precise indication of disease.

Objectives: The study aimed to examine the discrepancy between self-reports and biomarkers of hypertension and diabetes in contemporary China, and to explore the sociodemographic characteristics that are correlated with misreporting.

Methods: Using data from the third wave of the China Health and Retirement Longitudinal Study (CHARLS), we selected individuals aged 40-85 years who participated in both a health interview survey and a biomarker examination. Sensitivity, specificity, false negative reporting and false positive reporting were used as measures of (dis)agreement or (in)validity. Binary and multinomial logistic regressions were used to estimate under-reporting and over-reporting of hypertension and diabetes.

Results: Self-reported hypertension and diabetes showed low sensitivity (71.98% and 49.21%, respectively) but high specificity (93.71% and 98.05%, respectively). False positive reporting of hypertension and diabetes was 3.85% and 1.67%, while false negative reporting was much higher, at 10.85% and 7.38%. Educational attainment, hukou, age and gender affected both the specific error and the overall error of reporting hypertension and diabetes, but with some differences in magnitude and direction.

Conclusion: Self-reported conditions underestimate the disease burden of hypertension and diabetes. Adding objective measurements to social surveys could improve data accuracy, allowing better understanding of socioeconomic inequalities in health, especially by collecting biological indicators for populations with limited access to regular healthcare in China. Furthermore, there is an urgent need to provide basic health education and physical examinations to citizens, to facilitate access to healthcare and to make focused interventions to lower the incidence and unawareness of disease in China.


Introduction
Hypertension and diabetes are two well-known risk factors for cardiovascular disease, the leading cause of death worldwide, with 17.79 million deaths in 2017 (Ritchie and Roser 2019). High-quality prevalence estimates based on biomedical measurements of both diseases are needed for monitoring cardiovascular disease risks and for planning public health interventions and prevention. Because biomedical data are costly and time-consuming to collect, economists and demographers have relied heavily on self-reported hypertension and diabetes to estimate prevalence and disease burden. However, recent research has raised doubts about the accuracy of self-reported disease. Reporting error can occur for a number of reasons. For example, survey respondents may report their disease differently depending on their last specific behavior, socially driven conceptions of what 'chronic disease' means, expectations of their own health, use of healthcare, and comprehension of the actual survey questions (Murray and Chen 1992, Newell,

information, the final sample sizes for hypertension and diabetes were 14,457 and 12,189, respectively. Characteristics of the sample are summarized in Table 1 ①.

Measures
Self-reported and biomedical measures of hypertension and diabetes
Self-reported data on hypertension and diabetes were obtained with the question, 'Have you been diagnosed with hypertension / diabetes by a doctor?'. Each respondent who answered no was then asked, 'Do you know by yourself whether you have hypertension?' ②. If a respondent answered either of the two questions in the affirmative, we coded self-reported hypertension or diabetes as 1, and otherwise as 0. Blood pressure was measured three times (approximately 45 s apart) on a single occasion, using an electronic monitor. The average of these readings was used to determine each respondent's blood pressure level. Hypertension was defined as a systolic blood pressure ≥140 mm Hg and/or a diastolic blood pressure ≥90 mm Hg and/or current use of antihypertensive medication, following the WHO guideline (Alwan 2011). Biomedical diabetes was measured from venous blood samples, which provided glycated hemoglobin (HbA1c). The diagnostic criterion for diabetes in our study was an HbA1c value ≥6.5%: if a respondent's HbA1c was at or above 6.5%, we coded biomedical diabetes as 1, and otherwise as 0. Although HbA1c may not be the most widely used screening test, it has been suggested as an alternative means of screening for diabetes and has been used in this way in many surveys (Bennett, Guo et al. 2007

the respondent held a rural hukou (0 = No, 1 = Yes).

① The distribution was very similar and no bias was found.
② The question was only asked of respondents who had not been diagnosed with hypertension by a doctor.
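The coding rules described under Measures can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' processing code: the function names, argument names and the example readings are hypothetical, and only the cut-offs (140/90 mm Hg, HbA1c 6.5%) and the averaging of the blood pressure readings come from the text.

```python
from statistics import mean

# Cut-offs taken from the Measures section.
SBP_CUTOFF = 140.0    # systolic blood pressure, mm Hg
DBP_CUTOFF = 90.0     # diastolic blood pressure, mm Hg
HBA1C_CUTOFF = 6.5    # glycated hemoglobin, percent


def biomedical_hypertension(systolic_readings, diastolic_readings, on_medication):
    """Return 1 if the averaged readings meet either cut-off or the
    respondent currently uses antihypertensive medication, else 0."""
    sbp = mean(systolic_readings)
    dbp = mean(diastolic_readings)
    return int(sbp >= SBP_CUTOFF or dbp >= DBP_CUTOFF or on_medication)


def biomedical_diabetes(hba1c_percent):
    """Return 1 if HbA1c is at or above 6.5%, else 0."""
    return int(hba1c_percent >= HBA1C_CUTOFF)


def self_reported(diagnosed_by_doctor, knows_by_self=False):
    """Return 1 if either survey question was answered in the affirmative."""
    return int(diagnosed_by_doctor or knows_by_self)


# Hypothetical respondent: three readings taken on a single occasion.
print(biomedical_hypertension([138, 142, 141], [85, 88, 86], on_medication=False))
print(biomedical_diabetes(6.4))
print(self_reported(diagnosed_by_doctor=False, knows_by_self=True))
```

Comparing the self-reported indicator with the biomedical indicator for each respondent then yields the four cells (true positive, false negative, false positive, true negative) on which the agreement measures are based.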

Health behavior
Drinking was a 5-category variable indicating the frequency of drinking in the past year: none (coded 1), less than 3 days/month (coded 2), less than 3 days/week (coded 3), 4 to 6 days/week or daily (coded 4), and twice a day or more (coded 5). Smoking was a continuous variable indicating the number of cigarettes smoked per day, which ranged from 0 to 100.

Analytic strategy
Our first step was to assess the difference in prevalence estimates between the two data collection methods: the prevalence of hypertension and diabetes was calculated from self-reported information and from the biomedical measurements obtained in CHARLS. To assess the accuracy of the self-reported data, we calculated sensitivity, specificity, false negative reporting and false positive reporting.

Sensitivity and specificity alone are of little practical use in helping a clinician estimate the probability of disease in individual patients (Akobeng, Ramanan et al. 2006). In addition, sensitivity and specificity capture only the error within a particular group (those with or those without the measured disease, respectively), not the overall error. We therefore identified both the total error and the specific error and assessed the sociodemographic characteristics correlated with misreporting (sensitivity, specificity, false negative reporting and false positive reporting) using binary and multinomial logistic regression, controlling for education, hukou, drinking, smoking, age, gender and marital status. Because the total error outcome (correct reporting, false negative reporting and false positive reporting) has more than two categories, the model has 2 equations as follows ③:

③ To assess differences across communities and provinces, we also fitted a random intercept model; the results showed that variation between communities and provinces was not statistically significant, so this article presents only the results of the common model.
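The two equations referred to in the analytic strategy are not reproduced in this text. As a sketch only, a multinomial logit for a three-category outcome is conventionally written as two log-odds equations against a reference category. The notation below is illustrative and assumes that correct reporting (C) is the reference category, with FN and FP denoting false negative and false positive reporting, $\mathbf{x}_i$ the vector of covariates (education, hukou, drinking, smoking, age, gender and marital status) for respondent $i$, and $\alpha$, $\boldsymbol{\beta}$ the intercepts and coefficient vectors; none of these symbols is taken from the original.

```latex
\begin{align}
\ln\frac{P(Y_i=\mathrm{FN}\mid \mathbf{x}_i)}{P(Y_i=\mathrm{C}\mid \mathbf{x}_i)}
  &= \alpha_{\mathrm{FN}} + \mathbf{x}_i^{\top}\boldsymbol{\beta}_{\mathrm{FN}}, \\
\ln\frac{P(Y_i=\mathrm{FP}\mid \mathbf{x}_i)}{P(Y_i=\mathrm{C}\mid \mathbf{x}_i)}
  &= \alpha_{\mathrm{FP}} + \mathbf{x}_i^{\top}\boldsymbol{\beta}_{\mathrm{FP}}.
\end{align}
```

Under this form, exponentiating an element of $\boldsymbol{\beta}_{\mathrm{FN}}$ or $\boldsymbol{\beta}_{\mathrm{FP}}$ gives the relative odds ratio of that error type versus correct reporting for a one-unit change in the corresponding covariate.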

Results
Sensitivity, specificity and false reporting of hypertension and diabetes
The prevalence of hypertension was 38.71% according to the biomedical test and 31.72% according to the self-reported data, indicating that self-reporting led to an underestimation of hypertension by 18.06%. Likewise, the prevalence of diabetes was 14.52% according to the biomedical data and 8.81% according to self-reports, indicating that self-reported data underestimated the prevalence of diabetes by 39.3%.

The prevalence of both self-reported and objectively measured hypertension and diabetes increased with age in China, and biomedical hypertension and diabetes rose considerably faster with age than self-reports, which means that the gap between self-reported and objectively measured disease widens with age, as shown in Figures 1 and 2. This suggests that undiagnosed hypertension and diabetes become more of a problem as individuals age. Since China has the greatest number of oldest-old adults in the world according to United Nations data, undiagnosed high blood pressure and diabetes may become more common over time.

Table 2 provides the sensitivity, specificity and false reporting of self-reported hypertension and diabetes compared with biomedical data. The overall sensitivity and specificity of self-reported hypertension were 71.98% and 93.71%, which meant that 28.02% of people with measured hypertension did not report it, and 6.29% of people without measured hypertension reported having it. The false reporting of hypertension was 14.7%; specifically, false positive reporting was 3.85% and false negative reporting was 10.85%. For diabetes, the overall sensitivity and specificity were 49.21% and 98.05%, which meant that over 50% of people with measured diabetes did not report it, and fewer than 2% of people without measured diabetes reported having it. The false reporting of diabetes was 9.05%; specifically, false positive reporting was 1.67% and false negative reporting was 7.38%, which meant that less than 10% of people misreported their diabetes status.

Comparing the four indicators above for hypertension and diabetes, we found that the overall misreporting differs from the group-specific error. Taken together, these results suggest a substantial public health problem of undiagnosed hypertension and diabetes in China. Note that relying on self-reported hypertension and diabetes will overlook undiagnosed cases and substantially underestimate the true disease burden in the population.
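The figures reported in this subsection are linked by simple identities: the false negative share of the whole sample is the biomedical prevalence times (1 - sensitivity), and the false positive share is (1 - prevalence) times (1 - specificity). The short Python sketch below recomputes them from the published prevalence, sensitivity and specificity; the function and key names are hypothetical, and because the inputs are already rounded to two decimals the recomputed values should be expected to match the reported ones only to within rounding.

```python
def reporting_rates(prevalence, sensitivity, specificity):
    """Derive whole-sample error shares from proportions in [0, 1]."""
    false_negative = prevalence * (1.0 - sensitivity)
    false_positive = (1.0 - prevalence) * (1.0 - specificity)
    # Self-reported prevalence = true positives + false positives.
    self_reported = prevalence * sensitivity + false_positive
    return {
        "false_negative": false_negative,
        "false_positive": false_positive,
        "total_error": false_negative + false_positive,
        "self_reported_prevalence": self_reported,
        "relative_underestimation": (prevalence - self_reported) / prevalence,
    }


# Inputs taken from the Results text (biomedical prevalence, sensitivity, specificity).
hypertension = reporting_rates(0.3871, 0.7198, 0.9371)
diabetes = reporting_rates(0.1452, 0.4921, 0.9805)

for name, rates in (("hypertension", hypertension), ("diabetes", diabetes)):
    print(name, {key: round(100 * value, 2) for key, value in rates.items()})
```

The same identities explain why the overall error can be modest even when sensitivity is low: with a biomedical diabetes prevalence of about 15%, missing roughly half of the true cases still affects well under 10% of the whole sample.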

Sociodemographic characteristics of sensitivity, specificity and false reporting
The estimated effects on our two sets of discrepancy measures between self-reported hypertension / diabetes and the underlying biomarker, obtained using binary and multinomial logistic regression, are shown in Tables 3 and 4. Columns 1 and 2 show the estimated effects of the predictor variables on the sensitivity and specificity of self-reported health relative to the biomarker, using binary logistic regression. Columns 3 and 4-5 show the estimated binary and multinomial effects of the predictor variables on reporting, with reporting divided into three categories: correct reporting, false negative reporting and false positive reporting. We transformed the coefficients into relative odds ratios by exponentiating the original coefficients.

First, we look at the reporting error within a particular group. Among those with measured hypertension, individuals with a higher educational level, an urban hukou, age ≥50 years, female gender and a healthy lifestyle were strongly and independently more likely than their counterparts to self-report hypertension accurately (sensitivity). The results in Table 3 suggest that no respondent characteristic was significantly associated with more accurate reporting in terms of specificity except for age group and lifestyle: among those without hypertension, elderly people were more likely to report in error than those younger than 60 years of age, and people who drank alcohol every day were more likely to report in error. However, smokers may be more likely to correctly report the absence of hypertension.

Next, we turn to the overall error. The likelihood of reporting errors decreases with education, but not significantly. Urban hukou, not drinking, being female, younger age and being married were strongly and independently associated with correct reporting (Column 3 of Table 3). The odds ratio for rural hukou was 0.877, meaning that, compared with urban hukou, rural hukou was associated with 12.3% lower odds of correct reporting. Compared with non-drinkers, people who drank every day or more were more likely to report in error. Men were more likely to misreport than women. Elderly people were more likely to misreport than those younger than 50 years of age, and the error rate increased with age. Unmarried people were more prone to misreporting than married people.

Specifically, educational level had a significant effect on the risk of false negative reporting but not on false positive reporting (Columns 4-5 of Table 3). The propensity for false negative reporting declined significantly along the educational gradient. The odds ratios for primary education and for secondary education and above were 0.875 and 0.829, respectively, meaning that, compared with the illiterate, those with primary education had 12.5% lower odds of false negative reporting while those with secondary education and above had 17.1% lower odds. Although people with higher educational attainment had more false positive reporting than their counterparts, the effect was not significant: compared with the illiterate, people with primary education had an 11.6% increase in the odds of false positive reporting, while those with secondary education had a 14.8% increase. Having a rural hukou, being aged ≥50 years, being male and having an unhealthy lifestyle were strongly associated with false negative reporting. False positive reporting was almost unrelated to sociodemographic characteristics.

For self-reported diabetes, education, hukou, drinking and age were associated with sensitivity. Among those with measured diabetes, older participants with higher levels of education, an urban hukou and light drinking were more likely than their respective counterparts to self-report diabetes accurately. Multivariate analyses showed that, among those without diabetes, younger people with a rural hukou and light drinking reported the absence of diabetes slightly more accurately than their counterparts.

Our indicator of education had almost no statistically significant effect on false reporting of diabetes, except that secondary education and above might slightly reduce false negative reporting (Columns 3-5 of Table 4). Contrary to our expectations, the discrepancy between self-reported and biomedical diabetes did not depend on educational level, which is similar to previous research (Al Shamsi and Almutairi 2018), although educational attainment was associated with the prevalence of diabetes. Compared with people with an urban hukou, those with a rural hukou had an 18.6% increase in the odds of correct reporting, owing to the lower prevalence of diabetes in rural areas. Elderly people were more prone to false reporting and false negative reporting than those younger than 50 years of age, and the odds rose significantly with age. Specifically, compared with that group, people aged 50 to 59 had a 37.1% increase in false negative reporting, those aged 60 to 69 a 67.6% increase, those aged 70 to 79 a 96.7% increase and those aged 80 and above a 122.5% increase.
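The percentage changes quoted above follow from the exponentiated coefficients by the usual rule, percent change in odds = (odds ratio - 1) x 100. The Python sketch below illustrates the conversion; it assumes, as the text states, that the tabulated values are already exponentiated coefficients, and the function names are hypothetical.

```python
import math


def odds_ratio_from_coefficient(beta):
    """Exponentiate a logit coefficient to obtain a (relative) odds ratio."""
    return math.exp(beta)


def percent_change_in_odds(odds_ratio):
    """Percent change in the odds relative to the reference group."""
    return (odds_ratio - 1.0) * 100.0


# Odds ratios quoted in the text for hypertension.
for label, value in (
    ("rural hukou, correct reporting", 0.877),
    ("primary education, false negative", 0.875),
    ("secondary education and above, false negative", 0.829),
):
    print(f"{label}: {percent_change_in_odds(value):+.1f}% change in odds")
```

Read in the other direction, a reported increase of 37.1% corresponds to an odds ratio of 1.371, and an increase of 122.5% to an odds ratio of 2.225.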
The results confirmed that educational attainment, hukou, age and gender affect both the specific error and the overall error of reporting hypertension and diabetes, but with some differences in magnitude and direction.

Discussion
Using data on Chinese adults aged 40 and above, we analyzed two diseases that are commonly used in clinical evaluations of health-related risk, one of which rests on a fitness biomarker and the other (diabetes) on a disease risk biomarker, and compared self-reports with biomedical measurements. We found a large difference between the percentage of the sample who reported having hypertension and diabetes (31.72% and 8.81%) and the percentage who were measured to have the two diseases (38.71% and 14.52%). As published reports from China indicate, relying solely on self-reported measures of disease will tend to underestimate the true extent of the disease burden in contemporary China, and we do find that self-reporting led to an underestimation of hypertension and diabetes by 18.06% and 39.3%, respectively. We find that the disease identified by the disease risk biomarker (diabetes), which requires more complex and costly collection methods, is more prone to underestimation, which hinders disease assessment and intervention.

We also examined the sociodemographic characteristics that are correlated with misreporting by using four indexes (sensitivity, specificity, false negative reporting and false positive reporting). For hypertension, we find that educational attainment, hukou, age and gender affect both the specific error and the overall error of reporting, but with some differences in magnitude and direction. Educational attainment was an important explanatory factor, and had a

Besides, hukou, as one of the most important redistributive institutions under Chinese state socialism, had an effect on sensitivity and false negative reporting: people with a rural hukou were more likely to report in error among those with hypertension and to give a false negative report. With age, sensitivity went up and specificity went down. However, false reporting increased with age. For diabetes, educational attainment was also an important explanatory factor, and had a significant impact on sensitivity and false negative reporting, which meant that the least educated people tended to underestimate their diabetes burden more than their counterparts. Besides, hukou had an effect on sensitivity, specificity and correct reporting: people with a rural hukou were more likely to report in error among those with diabetes and to report correctly among those without diabetes. As a whole, people with a rural hukou reported more correctly. Elderly people were more likely to be aware of their diabetes, but less likely to correctly report the absence of disease. False reporting and false negative reporting went up significantly with age.

We draw three lessons from our results. First, our findings confirm earlier concerns that there is a big gap between self-reports and biomarkers in China. China is a rapidly rising developing country and is undergoing rapid population aging, which is generally associated with non-communicable and chronic conditions. Although China's great progress in including biological and anthropometric measures of health in this and other surveys expands the possibilities for research on biomarkers and social construction, underdiagnosis of disease remains common in China. Considering this underdiagnosis, the Chinese government should increase awareness of disease and reassess the burden of disease. Second, self-reported data underestimate the disease burden of hypertension and diabetes, and the underestimation of diabetes is greater. Adding objective measurements to social surveys could improve data accuracy, allowing better understanding of socioeconomic inequalities in health. We underline the need to supplement subjective health data with comprehensive and reliable biomedical measures where possible. As objective measures of health, biomarkers are more valid measures of physiological function "under the skin", meaning that biosocial approaches that highlight the importance of social factors in biomedical processes and intervene in the social conditions that cause inequity, including avoidable inequity, will become increasingly important (Harris and Schorpp 2018). Third, there is an urgent need to provide basic health education and physical examinations to citizens, to facilitate access to healthcare and to make focused interventions to lower the incidence and unawareness of disease in China.

A drawback of the paper is that we used cross-sectional data, which limits causal inference. The challenge of identifying causal effects is common to most scientific research, including research on social stratification and health. In this regard, longitudinal data with biomarkers or genetic information are especially useful for sorting out causal effects. For example, having baseline biomarker measures prior to some social exposure or self-assessment enables researchers to identify change in the biomarker in response to that exposure, and to explore whether and to what extent the age trajectories of self-reported health, biomarkers and their discrepancy depend on educational level.

Another limitation is that biomedical measurement is an imperfect criterion. Problems arise when availability of the biomarker is differentially related to either the disease or the exposure, or when the specimen acquisition, storage, measurement or ascertainment procedures differ between those with and those without the disease or outcome of interest. The most important source of confounding is the failure to identify factors that may alter the measurement of the biomarker, such as metabolic factors. If biological stability is not guaranteed, the accuracy of the biomarker cannot be guaranteed. Future work should also consider additional data sources and repeated, multiple biomarker measurements to complement survey data.

Conclusion
The prevalence of hypertension and diabetes increases with age in China, with many older people remaining undiagnosed. Self-reported hypertension and diabetes showed low sensitivity (71.98% and 49.21%, respectively) but high specificity (93.71% and 98.05%, respectively). False positive reporting of hypertension and diabetes was 3.85% and 1.67%, while false negative reporting was much higher, at 10.85% and 7.38%. Educational attainment, hukou, age and gender affect both the specific error and the overall error of reporting hypertension and diabetes, but with some differences in magnitude and direction. This is the first report in China to assess undiagnosed hypertension and diabetes using four indexes and to evaluate the sociodemographic characteristics that are correlated with misreporting, and the results confirm that self-reported conditions underestimate the disease burden. Adding objective measurements to social surveys could improve data accuracy, allowing better understanding of socioeconomic inequalities in health.