Development and validation of the geriatrics health behavior questionnaire (GHBQ)

Background Considering the importance of health behaviors in health outcomes, it is necessary to assess health behaviors precisely. This study aimed to develop and validate the Geriatrics Health Behavior Questionnaire (GHBQ) among Iranian older adults.
Methods This cross-sectional, methodological study was conducted on 420 community-dwelling older adults (age ≥ 60) selected through random multi-stage sampling. The initial questionnaire was developed with 22 items and seven subscales based on an extensive literature review, evaluation of related questionnaires, and experts' opinions. Face and content validity were evaluated by interviewing 10 older adults and 18 specialists. Construct validity was evaluated via known-groups validity and convergent validity. The reliability of the questionnaire was assessed by internal consistency, test-retest, and absolute reliability.
Results Face validity was assessed through interviews with older adults and specialists' opinions, and the items were grammatically and lexically corrected accordingly. Two items were deleted due to CVR < 0.44. The modified Kappa statistic (K*) and I-CVI for all items were higher than 0.88. The average content validity index (S-CVI/Ave) was 0.94. Three items were deleted to improve internal consistency; the final GHBQ consisted of 17 items with Cronbach's α = 0.72. Convergent validity was supported by a significant correlation between the GHBQ and the SF8™ health survey (r = 0.613, p < 0.001). An independent t-test showed that older adults with education level ≥ high school had significantly higher health behavior scores than those with education level < high school (11.93 ± 2.27 vs. 9.87 ± 2.35, t = −9.08, p < 0.001). The intra-class correlation coefficient (ICC) for the total questionnaire was 0.92 (95% CI = 0.84 to 0.96). The standard error of measurement (SEM) and minimal detectable change (MDC95) were 0.71 and 1.98, respectively.
Conclusion The results showed that the Geriatrics Health Behavior Questionnaire had suitable validity and reliability among Iranian older adults. Given its comprehensiveness and brevity, it is recommended for use in other populations after validation.
Supplementary Information The online version contains supplementary material available at 10.1186/s12889-022-12927-1.

Bakhshandeh Bavarsad et al. BMC Public Health (2022) 22:526
(2010), is to enhance healthy behavior for all ages. Such plans help grow a population of older adults who have a healthy lifestyle [2]. A healthy lifestyle is a comprehensive concept that includes behaviors such as alcohol and tobacco use, sexual activity, sedentary behavior, exercise, diet, stress management, medical adherence, and check-ups [3][4][5][6][7][8][9][10]. Rowe and Kahn (1997) showed that older adults in the successful aging category actively seek a lifestyle that promotes their quality of life and health [11]. Previous studies showed that healthy behaviors play a key role in health and vitality.
A systematic review showed that a healthy lifestyle may protect cognitive health in later life and reduce the risk of dementia and cognitive decline [12]. Evidence also supports the protective effects of healthy lifestyle behaviors on health outcomes such as better sleep quality, physical health [6], and quality of life; lower rates of stress, health distress, depression, frailty [13], diabetes [14], and perceived hearing difficulties [15]; and reduced mortality risk [7,8,16].
Considering the importance of health behaviors in health outcomes, a precise assessment of health behaviors is necessary. This necessity has led to the development of a wide range of instruments, some of which are general and others age-specific. The HELP-Screener is a 15-item questionnaire developed for older adults. It is a valid, time-efficient, easy-to-understand questionnaire with yes/no responses. It covers various aspects of a healthy lifestyle such as exercise, diet, socialization, leisure, and spirituality; however, it does not separate these aspects into dimensions [2]. The Health Promoting Lifestyle Profile II (HPLP-II) is a general questionnaire with 52 items in six subscales (nutrition, physical activity, health responsibility, stress management, interpersonal relationships, and spiritual growth) [17]. It is a cross-cultural questionnaire that has been translated into different languages, including Persian, Malaysian, Spanish, Brazilian, Chinese, and Turkish [18][19][20][21][22][23]. Tanjani et al. (2016) showed that HPLP-II is a suitable tool for assessing health behaviors among older adults [21], but Padula (1997) emphasized that some of the HPLP items are not appropriate for the older adult population [3]. Although HPLP-II is an appropriate tool and covers various dimensions of health behaviors, it is too long and not user-friendly for assessing older adults, especially when it is administered together with other questionnaires. The Health-Promotion Activities of Older Adults Measure was designed by Padula in 1997 [3]. It is a 44-item instrument with five subscales: Collaborative Health Management/Injury Prevention, Stress Reduction/Rest and Relaxation, Exercise, Substance Abuse Prevention, and Nutrition. Its strength is that it was designed specifically for older adults, but, like HPLP-II, it is too long.
Most studies use several separate items to assess health behaviors instead of a specific questionnaire (e.g., "How often in the past month did you eat (fresh) fruit?", "Have you gained more than 10 pounds in the last 6 months?", "Do you do vigorous exercises for 15-30 min or more at least three times a week?", …) [8,24-28], and the validity and reliability of these items are unclear. This study aimed to develop and validate a practical, short, and simple instrument that covers all aspects of health behaviors among older adults.

Study design
This is a cross-sectional, methodological study that was conducted in Tehran, Iran. Tehran is the capital of Iran, with various patterns of ethnicities, subcultures, and socio-economic levels.

Item generation
The concept and dimensions of health behaviors were derived using a deductive approach. The initial questionnaire was developed based on an extensive literature review, evaluation of related questionnaires, and experts' opinions. Literature was searched in several databases, including Scopus, PubMed, Web of Science, and Google Scholar, with the following keywords: health behavior, lifestyle, health promotion, older adult, aging, and elderly. We did not limit the search to a specific time interval. The instruments and items used to assess health behaviors among older adults in different studies were extracted and then assessed by the research team. The following questionnaires were used for developing the items of the GHBQ: HPLP-II [17], the Health Promotion Activities of Older Adults Measure [3], and the Morisky Medication Adherence Scale (MMAS) [29]. The items selected by the research team were sent to scholars of gerontology, geriatrics, nursing, psychology, and social sciences, and their comments were gathered, assessed, and incorporated into the questionnaire. The experts' opinions helped us to select and arrange appropriate dimensions and items.
The final Geriatrics Health Behavior Questionnaire (GHBQ) includes 17 items in seven dimensions: physical activity (1 item with 2 parts), nutrition status (2 items), medication adherence (4 items), stress management (4 items), smoking and alcohol consumption (2 items), sleep quality (2 items), and medical check-ups (2 items). GHBQ items are scored between 0 and 1 based on the accumulation approach previously used to create the frailty index [30]. Total questionnaire scores range between 0 and 17, and higher values indicate better health behaviors. The final 17-item questionnaire and the scoring method are presented in Appendix 1.

Content related validity
In this study, the content-related validity includes face validity and content validity.

Face validity
Both qualitative and quantitative face validity were evaluated in this study. We interviewed 10 older adults selected by convenience sampling to evaluate the difficulty, relevancy, and ambiguity of the items. The research team then assessed the interviews, and all comments were used to edit the statements. The impact score formula (Frequency (%) × Importance) was used to evaluate quantitative face validity. Items with an impact score ≥ 1.5 were considered appropriate for further analysis [31].
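The impact score described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text gives the formula but not the rating scale, so the sketch assumes a 5-point importance rating in which ratings of 4 or 5 count toward the frequency, and the ratings shown are invented for the example.

```python
def impact_score(ratings, important_from=4):
    """Impact score = frequency x importance.

    frequency  : proportion of respondents rating the item at or above
                 `important_from` (assumed cut-off on an assumed 5-point scale)
    importance : mean rating of the item
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    frequency = sum(r >= important_from for r in ratings) / len(ratings)
    importance = sum(ratings) / len(ratings)
    return frequency * importance


# Hypothetical ratings from 10 interviewed older adults for one item.
ratings = [5, 4, 4, 3, 5, 4, 2, 5, 4, 4]
score = impact_score(ratings)
keep_item = score >= 1.5  # retention threshold stated in the text
```

Under these assumptions an item rated as important by most respondents comfortably exceeds the 1.5 threshold.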

Content validity
Qualitative content validity was used to assess the statements in terms of grammar, wording, item allocation, scaling, clarity, and simplicity. Eighteen specialists, consisting of gerontologists, geriatricians, geriatric nurses, and psychologists with Ph.D. degrees, were asked to provide feedback to edit and revise the statements.
Quantitative content validity was assessed with two indices, namely the content validity ratio (CVR) and the content validity index (CVI). The CVR and CVI assess the necessity and relevancy of the items, respectively. In the present study, relevancy was the only criterion used to evaluate CVI, following Polit et al.'s suggestion [31,32].
To calculate the CVR index, first, each item was rated by experts according to a 3-point scale from not necessary to essential. Then the CVR index was estimated by using this formula: CVR = (nE − N/2)/(N/2), where nE signifies the number of panelists indicating "essential" and N is the total number of panelists. The accepted CVR value, based on the critical value of Lawshe's table and the number of subject matter experts (18 experts), was considered > 0.444 [33].
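As a small worked illustration of the CVR formula above, the following Python sketch computes the ratio for a panel of 18 experts. The function name and the example counts are illustrative and do not come from the study data.

```python
def content_validity_ratio(n_essential, n_panelists):
    """Lawshe's CVR = (nE - N/2) / (N/2)."""
    if n_panelists <= 0 or not 0 <= n_essential <= n_panelists:
        raise ValueError("need 0 <= n_essential <= n_panelists and n_panelists > 0")
    half = n_panelists / 2
    return (n_essential - half) / half


# With 18 panelists, 13 "essential" ratings give a CVR of about 0.444,
# which is the critical value quoted in the text; 12 ratings fall below it.
cvr_13 = content_validity_ratio(13, 18)
cvr_12 = content_validity_ratio(12, 18)
```

The ratio ranges from -1 (no panelist rates the item essential) to 1 (all do), with 0 meaning exactly half of the panel.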
The content validity index was assessed for each item on a 4-point scale from not relevant to completely relevant. I-CVI was calculated by dividing the number of experts giving a rating of '3' or '4' to each statement by the total number of experts. An I-CVI score over 0.79 was considered adequate. The scale-level CVI was calculated as S-CVI/Ave, that is, the sum of the I-CVI values divided by the number of items. According to Polit, an acceptable S-CVI/Ave is equal to or more than 0.9.
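The two indices can be illustrated with a short Python sketch. The rating matrix is hypothetical (three items rated by four experts), and the function names are chosen for the example only.

```python
def item_cvi(ratings):
    """I-CVI: share of experts rating the item 3 or 4 on the 4-point scale."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    return sum(r >= 3 for r in ratings) / len(ratings)


def scale_cvi_ave(ratings_per_item):
    """S-CVI/Ave: mean of the I-CVI values across all items."""
    icvis = [item_cvi(r) for r in ratings_per_item]
    return sum(icvis) / len(icvis)


# Hypothetical relevance ratings: one inner list per item, one value per expert.
ratings_per_item = [
    [4, 4, 3, 4],
    [4, 3, 2, 4],
    [3, 4, 4, 4],
]
icvis = [item_cvi(r) for r in ratings_per_item]   # 1.0, 0.75, 1.0
s_cvi = scale_cvi_ave(ratings_per_item)
```

In this toy example the second item would fall below the 0.79 cut-off even though the scale-level average stays above 0.9, which is why both levels are usually reported.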
Because I-CVI does not account for agreement that could occur by chance among several raters, the present study also used the modified Kappa statistic (K*) designed by Polit et al.; K* > 0.74 is considered excellent [31]. The probability of chance agreement was first calculated using the following formula: Pc = [N!/(A!(N − A)!)] × 0.5^N, where N signifies the number of experts and A is the number of experts who agreed that the item was relevant. Then, having calculated I-CVI for all items, kappa was computed by the following formula: K* = (I-CVI − Pc)/(1 − Pc).
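A minimal Python sketch of the modified kappa follows. It assumes the form commonly attributed to Polit et al., namely Pc = [N!/(A!(N − A)!)] × 0.5^N and K* = (I-CVI − Pc)/(1 − Pc); the function names and example counts are illustrative.

```python
from math import comb


def chance_agreement(n_experts, n_agree):
    """Probability that exactly n_agree of n_experts agree by chance (p = 0.5)."""
    return comb(n_experts, n_agree) * 0.5 ** n_experts


def modified_kappa(n_experts, n_agree):
    """K* = (I-CVI - Pc) / (1 - Pc), with I-CVI = n_agree / n_experts."""
    icvi = n_agree / n_experts
    pc = chance_agreement(n_experts, n_agree)
    return (icvi - pc) / (1 - pc)


# Example: 16 of 18 experts rate an item as relevant.
k_star = modified_kappa(18, 16)
is_excellent = k_star > 0.74  # cut-off stated in the text
```

With a panel as large as 18, the chance term is very small, so K* stays close to the I-CVI itself.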

Sampling
This study selected 420 eligible community-dwelling older adults (age ≥ 60 years, with adequate cognitive functioning and ability to communicate) by multi-stage cluster sampling. First, the 22 districts of Tehran were classified into five clusters in terms of socio-economic development level, from developed areas to underdeveloped (very poor) areas [34]. One district in each cluster and then two regions in each district were selected randomly. The sample size in each district was determined based on the ratio of its population to the overall population. To estimate the intra-class correlation coefficient (ICC), 30 older adults were selected to complete the questionnaire twice, 2 weeks apart. All questionnaires were completed through door-to-door surveys.
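The proportional allocation step can be sketched as follows. The district labels and population figures are hypothetical, and the largest-remainder rounding rule is an assumption made here so that the allocated sizes add up exactly to the target sample; the text does not state how rounding was handled.

```python
def proportional_allocation(populations, total_sample):
    """Split total_sample across districts in proportion to population.

    Uses integer arithmetic and hands out leftover units to the districts
    with the largest remainders (an assumed rounding rule).
    """
    total_pop = sum(populations.values())
    base = {}
    remainder = {}
    for name, pop in populations.items():
        base[name], remainder[name] = divmod(total_sample * pop, total_pop)
    leftover = total_sample - sum(base.values())
    for name in sorted(remainder, key=remainder.get, reverse=True)[:leftover]:
        base[name] += 1
    return base


# Hypothetical populations of the five selected districts.
populations = {"A": 300_000, "B": 250_000, "C": 200_000, "D": 150_000, "E": 100_000}
allocation = proportional_allocation(populations, 420)
```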

Construct validity
The construct validity of the Geriatrics Health Behavior Questionnaire was evaluated via known-group validity and convergent validity.

Convergent validity
Convergent validity was assessed via associations with other measures related to health behaviors, including health-related quality of life (SF8™). Most studies report a significant relationship between health behaviors and quality of life; several studies have also shown the effect of health behavior interventions on quality of life [13,35-40]. So, the present study used the SF8™ health survey [41] [43]. The Pearson correlation coefficient was used to evaluate the relationship between the SF8™ health survey and the Geriatrics Health Behavior Questionnaire.
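For readers who want to reproduce this kind of check, the Pearson coefficient can be computed directly from paired scores. The sketch below uses only the Python standard library, and the paired values are invented for illustration; they are not study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired scores: GHBQ total (0-17) and an SF8-type summary score.
ghbq_total = [8, 10, 11, 12, 13, 14, 9, 15]
sf8_score = [40, 44, 47, 50, 49, 55, 45, 56]
r = pearson_r(ghbq_total, sf8_score)
```

A positive coefficient in the moderate-to-high range would be read as support for convergent validity; significance testing is left out of this sketch.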

Known-group validity (discriminative validity)
The relation between education and health behaviors is explained by three behavioral effects. First, a person with a higher level of education can use resources more efficiently to produce better health. Second, education is related to subjective variables such as time preference. Third, educated people are more aware of the future effects of their activities, which means that well-educated people live more healthily. For example, knowing the consequences of smoking leads them to avoid this unhealthy behavior [44].
Healthier lifestyles are chosen more often by educated people based on their knowledge regarding the relationship between health behaviors and health outcomes [45]. Older adults in this study were divided into two groups based on education level (<high school, ≥high school). Then health behaviors were compared between groups using an independent t-test.
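The group comparison can be sketched from summary statistics alone. The sketch assumes the classical pooled-variance (Student) form of the independent t-test; the means and standard deviations are those reported in this paper, whereas the group sizes of 200 and 220 are hypothetical placeholders because the actual sizes are not given in this section.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's independent-samples t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


# Group 1: < high school; group 2: >= high school.
# n1 = 200 and n2 = 220 are assumed values for illustration only.
t_stat, df = pooled_t(9.87, 2.35, 200, 11.93, 2.27, 220)
```

The statistic is negative because the lower-education group is entered first, which matches the sign convention of the reported result.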

Reliability
Internal consistency of the subscales was assessed by Cronbach's alpha, by Kuder-Richardson for subscales with dichotomous choices [46], and by the Spearman-Brown coefficient for two-item subscales, based on the results of Eisinga et al. [47]. Internal consistency for the total GHBQ was assessed by Cronbach's alpha (values above 0.7 were considered acceptable) [48] and the McDonald omega coefficient [49]. Kirk and Miller (1986) stated that the reliability of a questionnaire could be determined by measuring the correlation between the scores of each item and the total score (item-total correlation) [50]; values above 0.25 are considered high according to Nunnally and Bernstein (1994) [46]. Cohen (1988) also classified the values into three categories: small (0.10 to 0.29), medium (0.30 to 0.49), and high (0.50 to 1.00) [51]. The Pearson correlation coefficient was used to evaluate the correlation of each subscale score with the total score.
The intra-class correlation coefficient (ICC) with a two-way mixed-effects model and absolute agreement (95% confidence level) was calculated to evaluate test-retest reliability at the scale level. ICC values above 0.75 were considered acceptable [52]. Test-retest reliability at the item level was calculated for binary items using Cohen's (1960) Kappa coefficient [53]. The Landis and Koch (1977) guidelines were used to interpret the results: values from 0.0 to 0.2 indicate slight agreement, 0.21 to 0.40 fair agreement, 0.41 to 0.60 moderate agreement, 0.61 to 0.80 substantial agreement, and 0.81 to 1.0 almost perfect or perfect agreement [54].
The standard error of measurement (SEM) and minimal detectable change (MDC95) were used to evaluate absolute reliability, using the following formulas: SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM, where SD is the standard deviation of the scores. The minimal detectable change is the minimum amount of change that must be observed to be considered a real change. Thus, a change in scores smaller than MDC95 can be attributed to measurement error [55].
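The internal consistency coefficients named above can be illustrated with a short standard-library sketch. It assumes the usual textbook definitions: Cronbach's alpha computed from item and total-score variances (for dichotomous items this is the same quantity as KR-20 when the same variance convention is used) and the Spearman-Brown coefficient 2r/(1 + r) for a two-item subscale. The response matrix is hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows = respondents, columns = item scores."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_vars = [variance(col) for col in zip(*rows)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def spearman_brown(r):
    """Reliability of a two-item subscale from the inter-item correlation r."""
    return 2 * r / (1 + r)


# Hypothetical yes/no (1/0) answers of six respondents to a four-item subscale.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 0, 1, 1],
]
alpha = cronbach_alpha(responses)
two_item_reliability = spearman_brown(0.5)
```

Because alpha depends on the number of items, short subscales tend to yield lower values than the full scale even when the items are reasonably related.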
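The item-level test-retest check can be sketched as follows. The sketch computes Cohen's kappa for two administrations of one binary item and labels it with the Landis and Koch bands listed above; the answers are hypothetical, and the "poor" label for negative values is taken from the usual presentation of those guidelines rather than from this text.

```python
def cohen_kappa(first, second):
    """Cohen's kappa for two sets of categorical answers from the same people."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    categories = set(first) | set(second)
    expected = sum((first.count(c) / n) * (second.count(c) / n) for c in categories)
    if expected == 1:
        return 1.0  # everyone gave the same single answer both times
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Verbal label for a kappa value (Landis and Koch, 1977)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical yes/no answers of 10 respondents, 2 weeks apart.
test = [1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
retest = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
kappa = cohen_kappa(test, retest)
```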
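The two absolute reliability indices can be sketched in Python, assuming the conventional definitions SEM = SD × sqrt(1 − ICC) and MDC95 = 1.96 × sqrt(2) × SEM. The standard deviation of 2.5 in the first call is a hypothetical value; the SEM of 0.71 in the second call is the value reported in this paper.

```python
from math import sqrt


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC) (conventional definition, assumed here)."""
    if not 0 <= icc <= 1:
        raise ValueError("icc must lie between 0 and 1")
    return sd * sqrt(1 - icc)


def minimal_detectable_change_95(sem):
    """MDC95 = 1.96 * sqrt(2) * SEM (conventional definition, assumed here)."""
    return 1.96 * sqrt(2) * sem


sem_example = standard_error_of_measurement(sd=2.5, icc=0.92)  # hypothetical SD
mdc_from_reported_sem = minimal_detectable_change_95(0.71)     # about 1.97
```

Under these assumptions, the reported SEM of 0.71 gives an MDC95 of about 1.97, close to the reported 1.98; the small gap would be consistent with the SEM having been rounded before publication.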

Ceiling and floor effects
Ceiling and floor effects occur when more than 15% of participants score at the highest or the lowest end of the scale, respectively, and their presence indicates that content validity is not appropriate. The ceiling and floor effects were calculated as percentages for all data [56].
The following descriptive and analytical indices were calculated using SPSS 22 software: Cronbach's alpha, Kuder-Richardson, Spearman-Brown coefficient, Pearson correlation coefficient, intraclass correlation coefficient, and independent t-test.
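The floor and ceiling percentages can be computed with a few lines of Python. The score list is hypothetical, and the default range of 0 to 17 follows the total score range described for the GHBQ.

```python
def floor_ceiling_effects(scores, min_score=0, max_score=17):
    """Return (floor %, ceiling %): share of respondents at each scale end."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    floor_pct = 100 * sum(s == min_score for s in scores) / n
    ceiling_pct = 100 * sum(s == max_score for s in scores) / n
    return floor_pct, ceiling_pct


# Hypothetical total scores of 10 respondents.
scores = [0, 5, 17, 12, 17, 9, 10, 11, 8, 13]
floor_pct, ceiling_pct = floor_ceiling_effects(scores)
has_floor_effect = floor_pct > 15      # 15% criterion stated in the text
has_ceiling_effect = ceiling_pct > 15
```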

Result
A total of 420 older adults, including 224 men (53.3%), were recruited in this study. Participants' ages ranged from 60 to 93 years, with a mean of 69.03 ± 7.61 years. Other demographic details of the participants are shown in Table 1.

Content related validity

Stage 1: face validity
A total of 10 older adults between the ages of 60 and 76 were interviewed. Thirty percent of the participants were illiterate, and 40% and 30% had diplomas and bachelor's degrees, respectively. Fifty percent of the sample were female, and all were married. The impact score of all items was higher than 1.5, so no item was removed in this phase. Some items were revised based on older adults' comments (e.g., "physical activity" was replaced by "exercise", and "When I am stressed, I often try to relax with listening to music, talking to someone, gardening or…" was replaced by "When I am stressed, I often try to do something such as listening to music, talking to someone, gardening or…").

Stage 2: content validity
We assessed both qualitative and quantitative content validity. The CVR results showed that two items had a value < 0.44, so both were deleted, leaving 20 items for assessment of the content validity index. The modified Kappa statistic (K*) and I-CVI for all of the items were higher than 0.88. The average content validity index (S-CVI/Ave) was 0.94, which showed that the total questionnaire had appropriate content validity. We also revised the grammar and wording of 9 items according to the expert panel's suggestions.

Convergent validity
Convergent validity was supported by a significant correlation between the GHBQ and the SF8™ health survey (r = 0.613, p < 0.001). The Pearson correlation coefficient also showed significant correlations between the GHBQ and the physical component (r = 0.598, p < 0.001) and the mental component (r = 0.510, p < 0.001) of the SF8™.

Known-group validity (discriminative validity)
An independent t-test showed that older adults with education level ≥ high school had healthier behaviors than those with education level < high school (11.93 ± 2.27 vs. 9.87 ± 2.35, t = −9.08, p < 0.001). The results showed significant differences between the two groups in all of the subscales except smoking and alcohol consumption (Table 2).

Reliability
In this step, three items (one item of nutrition status and two items of stress management) were deleted to improve the internal consistency; the final GHBQ consisted of 17 items with seven subscales. Cronbach's α for the total questionnaire was 0.72, and McDonald's omega was 0.714. The values for each subscale, presented in Table 3, showed moderate to good internal consistency. Cronbach's α was not assessed for the physical activity subscale because it has only one item. The Pearson correlation coefficient showed a medium to high correlation of the subscale scores with the total score of the questionnaire, except for the smoking and alcohol consumption subscale, for which the correlation was small. The correlations of the physical activity and sleep quality subscales with the total score were r = 0.47 and r = 0.60 (p < 0.001), respectively (Table 3). Item-total (biserial) correlations showed a significant correlation between each item and the total score. The biserial correlation values were between 0.13 and 0.69 (p < 0.01); the minimum value belonged to the item about alcohol consumption and the maximum to medication adherence. Based on Cohen's (1988) classification [51], 6 items had a high correlation, 6 had a medium correlation, and 5 had a small correlation with the total score. The test-retest method and the intraclass correlation coefficient (ICC) were used to assess the stability of the questionnaire. The ICC for the total questionnaire was 0.92 (95% CI = 0.84 to 0.96), and the ICC for physical activity was 0.98 (95% CI = 0.97 to 0.99) (Table 3). Kappa coefficients for binary items, including medication adherence and check-ups, ranged from 0.43 to 1, indicating moderate to perfect agreement.
The standard error of measurement (SEM) and minimal detectable change (MDC95) were 0.71 and 1.98, respectively. The ceiling and floor effects for the total questionnaire were 0%.

Discussion
Various tools have been designed to assess healthy lifestyles and health behaviors [2,3,17]. Some of these questionnaires have also been validated in the older adult population [21]. Although psychometric assessment of questionnaires is common in various age groups, the specific characteristics of older adults make it advisable to design questionnaires with simple, short, and clear items for this age group. The number of items is one element that requires consideration when designing questionnaires for older adults. Too many items put a burden on the respondents, and older adults usually tire of answering long questionnaires sooner than other age groups. A majority of studies on older adults' health behaviors have replaced questionnaires with a collection of questions [8,57-60]. However, the main limitation of these questions is that their validity and reliability are unclear. The current study has tried to develop a short questionnaire suitable for older adults, using objective questions and covering different dimensions. The current questionnaire includes physical activity, nutrition status, medication adherence, sleep quality, stress management, check-ups, and smoking and alcohol consumption. This study is the first step in the lengthy process of psychometric validation, and it primarily focuses on content and construct validity.
Questionnaires designed to study health behaviors generally ignore medication adherence, although this dimension is of great importance in the older adult population. Medication adherence is more important among older adults than in other age groups because of the multiple chronic diseases from which they usually suffer [61]. Therefore, it should be considered among health behaviors, and one dimension of the current questionnaire has been dedicated to this concept. Most studies point to breast examination, Pap smear, and prostate test for screening [25,62,63]. However, according to the United States Preventive Services Task Force (USPSTF) guidelines, the PSA test, breast self-examination, and Pap smear at older ages are grade D and are not recommended for older adults [64]. Questions about these behaviors therefore do not appear suitable for evaluating health behaviors. Thus, general annual physical exams and dental check-ups have been used in the current study.
Another dimension of this questionnaire is sleep quality. Sleep is important at all ages; however, it is especially crucial at older ages because of problems common among older adults, such as restless leg syndrome, taking various medicines, and environmental changes [65]. Sleep disorder is recognized as a geriatric syndrome [64]. Complications of poor sleep, such as fractures, can cause irreparable harm to older adults' health [66]. Two items in the present questionnaire investigate both the quality and quantity of sleep in older adults. A sleep duration of 7-8 h [8,57] or 7-9 h [55,56] is considered normal for older adults. The National Sleep Foundation advises between 7 and 9 h of sleep per night for adults aged 26-64 and 7-8 h of sleep for those aged 65 years and older [67]. Since the current study was conducted in a developing country, old age was defined as 60 years and over. As a result, the optimal sleep duration was set at 7-9 h. When this questionnaire is used for older adults aged 65 years and above, it is recommended that 7-8 h be considered the optimal sleep duration.
Based on health department guidelines, physical activity of 150 min or more per week is recommended for older adults [68]. Hence, in the present questionnaire, this objective criterion was used to assess older adults' physical activity by simply asking two questions (Appendix 1).
Regarding healthy nutrition, there are several factors, such as proper consumption of salt, oil, fish, and vegetables. According to the World Health Organization and Food and Agriculture Organization, a minimum of five portions of fruit and vegetables per day is suggested [69]. Most studies consider daily consumption of fruit and vegetables as a suitable criterion for a healthy lifestyle [57,58,60,70,71]. In this questionnaire, examples have been given to older adults to facilitate understanding fruit and vegetable portions. When designing a questionnaire, both stems of the questions and response options are critical. A large variety of choices leads to better data collection; however, it may be exhausting or confusing to older adults. Previous experience shows that older adults resisted answering multiple-choice questions. This may be due to their difficulty in information retrieval and unfamiliarity with such questions [72]. In the current questionnaire, the response options, tailored to each item, are designed to cause the least fatigue and complexity for older adults and minimize response time. For example, short-answer questions have been used regarding physical activity and sleep. Also, two-choice yes/no questions have been used regarding medication adherence and screening.
Various methods were used to estimate the validity of the designed scale. For instance, face validity was used to simplify the sentences and make them understandable; the sentences were edited based on the comments from older adults and experts. The modified Kappa statistic and content validity index results indicated that the Geriatrics Health Behavior Questionnaire had desirable content validity. Subscale scores had a moderate to high correlation with the total score, indicating proper validity of the questionnaire. Known-group and convergent validity were also used to estimate construct validity. The results showed that the current questionnaire not only had a good correlation with health and its dimensions (mental and physical health), but was also able to differentiate between groups (educated versus uneducated). Although our results showed no significant difference between educated and uneducated individuals in terms of alcohol consumption and smoking, similar findings have been reported in previous studies, such as that of Pärna K et al. (2014) [73]. In addition, results differed across cohorts in that study: there was no relationship between education and smoking among men in 1990-1994 and among women in 1990-2000 [73].
In the present study, all three types of reliability were examined, and the results indicate that the questionnaire has desirable reliability. Absolute reliability, expressed as the standard error of measurement (0.71), was below 10% of the total score, indicating desirable reliability. Moreover, the minimal detectable change was 1.98 for this questionnaire, indicating that changes equal to or greater than this value are, with 95% confidence, not due to measurement error or differences in how the study is conducted.
Although a Cronbach's alpha of over 0.7 is considered desirable, values higher than 0.5 are acceptable according to George and Mallery (2003) [74]. Based on the results obtained, the whole questionnaire had a desirable Cronbach's alpha with acceptable internal consistency. Although the values obtained for some subscales are not satisfactory, Cronbach's alpha should not be interpreted without considering the number of questions. One reason for a low Cronbach's alpha may be the low number of items in each dimension [75,76].
The current study has several strengths and weaknesses. The strengths are as follows: random sampling of community-dwelling older adults, with districts of the city at different levels of development included in order to reach a representative sample; the use of various validity and reliability methods, such as face, content, and construct validity, internal consistency, and absolute reliability; and the design of a questionnaire with a minimum number of items and a maximum number of dimensions. The weaknesses are as follows: because of the type of items and the range of different answers, exploratory factor analysis could not be performed; and the sample consisted only of community-dwelling older adults without cognitive disorders. Consequently, further psychometric assessment is necessary before this questionnaire is used in other groups.

Conclusion
The questionnaire designed in the present study covers the dimensions of health behaviors that are of particular importance among older adults, using simple, short items to increase the answering pace and minimize the respondents' burden. The results of this study revealed that the Geriatrics Health Behavior Questionnaire had good validity and reliability for use among Iranian community-dwelling older adults.