An exploratory model for the non-fatal drowning risks in children in Guangdong, China

Background Drowning is a leading cause of accidental death in children under 14 years of age in Guangdong, China. We developed a statistical model to classify children's risk of drowning based on known risk factors. Methods Multiple-stage cluster random sampling was used to select students in Grades 3 to 9 in two townships in Qingyuan, Guangdong. A self-reported questionnaire collected general information, knowledge, attitudes and activities. Univariate logistic regression models were used to preliminarily select independent variables for the multivariable model at a P value of 0.1. Three-quarters of the participants were randomly selected as a training sample to establish the model, and the remainder were used as a testing sample to validate it. Results A total of 8390 children were included in this study; about 12.18% (1013) had experienced drowning during the past year. In the univariate logistic regression model, introvert personality, unclear distribution of water areas on the way to school, and bad relationships with classmates and family were positively associated with drowning. In contrast, female gender, older age and lower swimming skills were negatively associated with drowning. When the prediction model built on these factors was applied to the testing sample, the Hosmer-Lemeshow test showed no significant difference between predicted and observed risk (χ2 = 5.97, P = 0.65). Conclusions Male gender, younger age, higher swimming skills, bad relationships with classmates and family, introvert personality and unclear distribution of water areas on the way to school were important risk factors for non-fatal drowning among children. The prediction model based on these variables has an acceptable predictive ability.


Background
Drowning in children is a serious public health concern around the world, with an average of 372,000 deaths from drowning every year [1]. Drowning is a leading cause of accidental death in children under 14 years of age in China [2,3] and in Guangdong Province, located in southern China [4].
Although a series of intervention measures has been implemented in China, drowning in children remains an important public health issue [5,6]. Most of these measures focused on risk factors related to children and the environment [1,7,8,9], and few studies have incorporated the characteristics of high-risk children into the prevention strategy. We therefore conducted this study to establish an exploratory model for predicting drowning risk among children.
We conducted this study in two rural townships in Qingyuan City of Guangdong Province because of the high mortality rate due to drowning and the abundance of open water bodies (unprotected natural or man-made water bodies). The findings from this study will provide important insights for the future prevention and control of drowning among children.

Subjects selection
Students were selected using multiple-stage cluster random sampling [10] in Guangdong Province in 2013. We randomly selected one city (Qingyuan City) from the 21 cities in Guangdong Province, and two townships from Qingyuan City because of their numerous rivers, ponds and reservoirs. All the students in Grades 3 to 9 in the two townships were included in this survey. A total of 8966 students aged between 8 and 18 years were selected to participate in this study. All the investigators were trained prior to the survey to assist students in completing the questionnaire in the classroom.

Data collection and ethics considerations
This study was approved by the Ethics Review Committee of the National Center for Chronic and Noncommunicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention (No: 201318). Agreement was obtained from every participant.

Questionnaire
The questionnaire used in this study was developed based on our previous project of integrated intervention for prevention of non-fatal drowning in children in Guangdong Province, 2006-2008 [8,11]. The variables included in the questionnaire were determined based on the literature and the current knowledge of the risk factors for drowning [12,13]. Non-fatal drowning was defined, in accordance with the World Health Organization [1], as the experience of respiratory impairment from submersion/immersion in liquid among surviving students during the year before the survey.
We collected potential drowning-related risk factors including general information (such as age, gender, etc.), education levels of the parents, relationships between family members, the students' own swimming skill levels, risk perception of drowning, high-risk behaviors (such as swimming in a pond, playing by a river, etc.), environment (such as distance from school to open water, distance from home to open water, presence of open water on the way to school, etc.) and disease burden of drowning (the cost of treating drowning, etc.) [8,9,11,14]. All the variables are described in Table 1. All the questionnaires were checked by the investigators.

Statistical analysis
The study participants were randomly divided into training and testing groups. The distributions of age, gender, and prevalence of drowning were similar in the two groups. The training group, comprising three-quarters of all subjects, was used to identify the significant variables of drowning and to establish the prediction model. The testing group was used to evaluate the predictive performance of the model. Logistic regression was used to establish the predictive model by selecting significant variables of drowning. The receiver operating characteristic (ROC) curve, together with sensitivity and specificity, and the Hosmer-Lemeshow (H-L) chi-square test were used to evaluate the prediction model.
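The random three-to-one division described above can be sketched as follows. This is a minimal illustration only: the function name `split_train_test`, the seed, and the sequential participant IDs are assumptions made for the example, and the study itself performed the analysis in Stata.

```python
import random


def split_train_test(ids, train_fraction=0.75, seed=2013):
    """Randomly assign participant IDs to a training and a testing group."""
    rng = random.Random(seed)      # fixed seed (an assumption) so the split is reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical IDs standing in for the 8317 students with complete information
participant_ids = range(1, 8318)
train_ids, test_ids = split_train_test(participant_ids)
print(len(train_ids), len(test_ids))
```

After such a split, the distributions of age, gender and drowning prevalence would be compared between the two groups, as the text describes.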
(1) Variable selection: We used univariate logistic regression models to examine the relationship between each potential risk factor and drowning, and selected significant variables at a P value of 0.1 for the multivariable logistic regression model [15,16]. The prediction model was established based on the significant variables in the multivariable logistic regression model [17]. All the statistical analyses were conducted using the Stata software (version 13.0).
Table 1 Definitions and values of variables
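The screening step can be illustrated for the simplest case of a single binary factor, where the univariate logistic regression estimate reduces to the odds ratio of a 2×2 table with a Wald test. The sketch below is not the authors' Stata code; the function name `univariate_binary_screen`, the factor names and all counts are hypothetical.

```python
import math


def univariate_binary_screen(a, b, c, d, alpha=0.1):
    """Univariate logistic regression of a binary outcome on one binary factor.

    a: exposed with outcome,   b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome.
    For a single binary predictor the maximum-likelihood odds ratio is ad/bc,
    and the Wald standard error of log(OR) is sqrt(1/a + 1/b + 1/c + 1/d).
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = log_or / se
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal P value
    return math.exp(log_or), p_value, p_value < alpha


# Hypothetical 2x2 counts for two candidate factors (not data from the study)
candidates = {
    "factor_A": (60, 340, 90, 910),
    "factor_B": (50, 450, 100, 900),
}

selected = []
for name, counts in candidates.items():
    odds_ratio, p_value, keep = univariate_binary_screen(*counts)
    print(f"{name}: OR={odds_ratio:.2f}, P={p_value:.4f}, keep={keep}")
    if keep:
        selected.append(name)
```

Only factors passing the P < 0.1 threshold would be carried forward to the multivariable model; ordinal or continuous factors would need a fitted regression rather than this closed form.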

General characteristics of study subjects
A total of 8390 students were initially recruited in this study. After excluding 73 students because of missing information, 8317 students remained (4367 males and 3950 females). The mean age was 12.43 ± 1.84 years. The proportion of students exposed to water activities (such as playing in or around open water) was 48.74%. A total of 1013 (12.18%) students had experienced non-fatal drowning in the year before the survey.

Univariate analysis on the risk factors of drowning in children
Younger age, male gender, higher swimming skill, introvert personality, unclear distribution of water areas on the way to school, and bad relationships with classmates and family were significantly associated with drowning risk in the univariate regression analyses (Table 2).

Multivariate analysis on the risk factors of drowning in children
In the multivariate analyses, drowning risk was significantly lower in females, older students and students with lower swimming skills. In contrast, introvert personality, unclear distribution of water areas on the way to school, and bad relationships with classmates and family were significantly associated with increased drowning risk. In the final regression model, trend tests were also performed by including ordinal variables as continuous measures (Table 3). The prediction model for drowning risk in children was established based on the variables retained in the multivariate analysis.
The drowning probability of 0.15 was set as a cutoff point. Students with drowning probability equal to or larger than 0.15 were defined as high drowning risk group, and students with drowning probability lower than 0.15 were categorized as low drowning risk group (Fig. 1).
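To make the cutoff rule concrete, the sketch below converts a logistic linear predictor into a probability and applies the 0.15 threshold. The intercept, coefficient values and variable names are hypothetical placeholders, since the fitted coefficients from Table 3 are not reproduced here.

```python
import math


def drowning_probability(intercept, coefficients, values):
    """Predicted probability from a logistic model: 1 / (1 + exp(-linear predictor))."""
    linear = intercept + sum(coefficients[name] * values[name] for name in coefficients)
    return 1 / (1 + math.exp(-linear))


def risk_group(probability, cutoff=0.15):
    """Classify as 'high' when the probability is at or above the cutoff, else 'low'."""
    return "high" if probability >= cutoff else "low"


# Hypothetical coefficients, NOT the fitted values from the study
intercept = -2.0
coefficients = {"male": 0.5, "introvert": 0.3}

student_a = {"male": 1, "introvert": 1}
student_b = {"male": 0, "introvert": 0}

for label, student in (("A", student_a), ("B", student_b)):
    p = drowning_probability(intercept, coefficients, student)
    print(label, round(p, 3), risk_group(p))
```

A lower cutoff would flag more students as high risk at the cost of more false positives, which is the trade-off summarized by the ROC curve.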

Predictive results and evaluation of the prediction model of drowning risks
We applied the prediction model to estimate the drowning risk of the students in the testing group, and then evaluated the predictions. The Hosmer-Lemeshow test showed no significant difference between the predicted and observed risk (χ2 = 5.97, P = 0.65), indicating good calibration of the model. The sensitivity of the model was 49.44%, the specificity was 74.76%, and the AUC was 0.680 (95% CI: 0.66-0.78), indicating acceptable discrimination (Fig. 2).
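The evaluation measures reported above can be sketched in a few lines. The toy outcomes and probabilities below are invented for illustration and the function names are hypothetical. The last step assumes that the Hosmer-Lemeshow test used 10 risk groups and therefore 8 degrees of freedom, which the text does not state; under that assumption the reported χ2 = 5.97 gives P ≈ 0.65, in line with the reported value.

```python
import math


def sensitivity_specificity(y_true, y_prob, cutoff=0.15):
    """Sensitivity and specificity when probability >= cutoff is predicted as a case."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= cutoff)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < cutoff)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < cutoff)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, y_prob):
    """AUC as the share of case/non-case pairs ranked correctly (ties count half)."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if pc > pn else 0.5 if pc == pn else 0.0
               for pc in cases for pn in controls)
    return wins / (len(cases) * len(controls))


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


# Toy data (hypothetical): 3 students with drowning, 5 without
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.30, 0.20, 0.10, 0.25, 0.12, 0.08, 0.05, 0.14]

sens, spec = sensitivity_specificity(y_true, y_prob)
print(round(sens, 3), round(spec, 3), round(auc(y_true, y_prob), 3))

# Assumption: 10 groups, so df = 8
hl_p_value = chi2_sf_even_df(5.97, 8)
print(round(hl_p_value, 2))
```

Sensitivity and specificity depend on the chosen cutoff, whereas the AUC summarizes discrimination over all cutoffs and the Hosmer-Lemeshow test addresses calibration.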

Discussion
A number of epidemiological studies have shown that drowning in children is affected by various risk factors, including individual factors (male gender, younger age, etc.), family factors (poverty, the education of the parents, etc.) and social factors (fewer policies, abundant bodies of water, etc.) [5,8].
Considerable efforts have been made by Chinese governments to prevent drowning in children; for example, the education department has released relevant documents and developed education materials in the past years. Nevertheless, drowning is still an important concern in children.
In this study, we established a prediction model based on the risk factors of drowning in children.
The established model could divide students into high-risk and low-risk groups according to their predicted drowning risk. More attention should be paid to the high-risk group when integrated interventions for the prevention of child drowning are implemented. The multivariate regression model is a widely used method for establishing a risk prediction model [18]. It can use cross-sectional survey data to explore population-level risks, and can also predict the individual-level risk of a certain disease using cohort study data. The dependent variable in the regression model can be a binary variable, and the independent variables can be either continuous or categorical. It is currently very common to establish a mathematical model based on an unconditional logistic regression model and long-term follow-up data [19]. The earliest prediction model for the 10-year risk of diabetes in people without diabetes was established in the Framingham Offspring Study using an unconditional logistic regression model [20]. Since then, more and more studies have employed a similar method [21,22].
Based on our previous studies [8,11,23], we preliminarily selected ten variables that were significant in the univariate analysis, such as age, gender, and relationships with classmates. Seven of these variables (all except "distance from home to open water", "number of sibling" and "distance from school to open water") were significantly associated with drowning risk in the multivariate model. Age and gender are common drowning risk factors in children in different countries [8,11,24]. We observed a significant association between higher swimming skill and increased risk of drowning in children, which is inconsistent with one previous study from Bangladesh [25]. Swimming ability could protect children in some controlled environments such as a swimming pool [26]. However, the effectiveness of swimming ability in reducing drowning risk has not been well defined. Individuals with better swimming skills might have more water exposure and engage in more dangerous behaviors, such as swimming in natural bodies of water and in unsupervised water, which could increase their drowning risk [27]. This may explain why swimming skill was a protective factor in Bangladesh but a risk factor in Guangdong Province, China [25,28]. We speculate that the living environment differs between Bangladesh (where most households are located near bodies of water) and China (where children need to spend several minutes or more to reach bodies of water). In China, children with a high level of swimming skill may therefore be more inclined to play in or around water.
In the present study, we established a prediction model including seven independent variables based on the binary unconditional logistic regression model. According to the cutoff point, all the included children were divided into two groups. The results in the testing sample indicated that the model's ability to predict the risk of non-fatal drowning in students was acceptable, with an area under the ROC curve (AUC) of 0.680. Generally speaking, the AUC is around 0.7 in population-based prediction models of a certain event [29,30], and is usually larger than 0.85 in prediction models based on clinical trials [31,32].
A few limitations should be noted. The cross-sectional design limits the ability to ascertain causal relationships between the risk factors and drowning. Recall bias may exist because most of the variables were self-reported, especially the occurrence of drowning, the personality of the participants, and their relationships with classmates and family. Information on water exposure, an important factor associated with the risk of drowning, was collected only in a simple form in this survey. This might be the underlying reason for the higher drowning rate observed among students with higher swimming skill; this issue will be further investigated in future studies.

Conclusion
In summary, male gender, younger age, higher swimming skills, bad relationships with classmates and family, introvert personality and unclear distribution of water areas on the way to school might be important risk factors for non-fatal drowning among children, and the prediction model based on these variables has an acceptable predictive ability.

Funding
The study was one of the projects of National Injuries Intervention-child drowning prevention. This project was funded by the National Center for Chronic Non-communicable Disease Control and Prevention, which provides guidance for the design and implementation of the project. The authors are solely responsible for the content of this paper.

Availability of data and materials
Original data of this study will not be shared because it is part of a screening survey organized by the Guangdong Provincial Commission of Health and Family Planning.