Participants and study design
The present cross-sectional study was conducted during the 2016–2017 academic year in Shandong Province, China. A 3-stage cluster sampling method was used to recruit a regionally representative sample of adolescents from 91 public middle and high schools in 13 administrative regions: Binzhou, Dongying, Dezhou, Heze, Jining, Jinan, Laiwu, Linyi, Qingdao, Rizhao, Weifang, Weihai, and Zibo (see Fig. 1). These 13 regions were randomly selected to ensure the geographical diversity of the sample.
A total of 28,062 secondary school students aged 12–19 years (14.5 ± 4.3 years) participated in the fitness assessments, and 27,955 participants (14,164 boys and 13,791 girls) were eventually included in the analysis after removal of extreme scores (107/28,062; < 1% of total sample). A total of 102 evaluators were recruited from physical education (PE) teachers working in middle and high schools who had previous experience in evaluating youth fitness and who had operated the National Student Fitness Test program, which aims to promote physical activity for school children and youth in China. All evaluators were given a testing manual that had been developed by the project team and that described all the test guidelines, procedures and protocols. In addition, all evaluators completed two training seminars designed to standardize and homogenize the assessment methods and quality control in order to reduce intra- and inter-tester errors. All evaluators organized the students for the physical fitness tests and guided them in answering the electronic questionnaires.
Assessments
Trained physical education (PE) teachers administered all tests by following standard operating procedures. The training was administered through workshops. Participants completed a standardized form, which included socio-demographic data, region, school, age, gender, date of birth, grade and ethnicity. Data collection was incorporated into the annual physical fitness test, conducted during the school physical education time mandated by the Ministry of Education in China. Physical fitness was assessed using the Chinese National Student Physical Fitness Standard (CNSPFS) battery [19], which contains seven tests that gauge different components of fitness. Each fitness test score was calculated using a grade- and sex-specific percentage, then categorized as “no pass”, “pass”, “good”, or “excellent”. The testing battery is a reliable and valid instrument to assess physical fitness in adolescents and is a norm testing battery in China [19]. The test-retest reliability across all assessments employed in the current study was ICC > 0.90, which was deemed acceptable.
Body mass index (BMI)
BMI was selected as a surrogate assessment of body composition. Participants’ height (cm) was measured to the nearest 0.1 cm in bare feet, while weight (kg) was measured to the nearest 0.1 kg (GMCS-IV; Jianmin, Beijing, China). Based on the height and weight results, BMI scores were calculated as weight in kilograms divided by squared height in meters (kg/m²). Participants from all the grade years recruited in the study (7th–12th grade) received the BMI assessment.
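The BMI calculation described above can be sketched as follows. This is a minimal illustration rather than the study's own code: the function name `bmi` and the example participant are hypothetical, and the conversion from centimeters to meters is included because height was recorded in cm.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Return body mass index (kg/m^2) from weight in kg and height in cm."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100.0  # height was recorded in cm; BMI needs meters
    return weight_kg / height_m ** 2


# Hypothetical participant (not study data): 50.0 kg, 160.0 cm
print(round(bmi(50.0, 160.0), 1))  # 19.5
```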
50-m Sprint run
This sprint test was administered on a flat and clear surface where participants were instructed to run in a straight line for 50 m. Each participant performed the test once as a single maximum sprint, and the performance was recorded to the nearest 0.1 s. All participants were required to complete the test.
Standing long jump
Participants stood behind a starting line marked on the floor with both feet together. The participants then jumped forward with maximum power, and the result was measured as the distance from the take-off line to the nearest point of contact on landing (back of the heels). Three attempts were allowed and the longest distance (in cm) was recorded as the official score.
Sit-and-reach
The sit-and-reach test reflects the flexibility of the lower body. Participants were instructed to take a seated position with both knees fully extended and feet placed firmly against a vertical support. They were asked to reach forward with their hands as far as possible along a measuring line. Each participant performed two trials of the sit-and-reach test, and the farthest distance reached was recorded as the score (measured to the nearest 0.1 cm).
1000-m/800-m distance run
This was a sex-specific test of participants’ cardiorespiratory endurance, consisting of a 1000-m run for boys and an 800-m run for girls. Participants were instructed to run as fast as possible over the required distance, although they were allowed to walk or stop during the test. Running performance was recorded to the nearest 0.1 s.
Timed sit-ups (girls)
The timed sit-up test was selected as an assessment of abdominal muscle endurance. Participants were instructed to perform as many sit-ups as possible in one minute. The testing staff counted the number of sit-ups during the period. The standard of a qualified sit-up was described as “to lay in a supine position with the knees bent and feet flat on the floor mat with their hands placed on the back of the head and fingers crossed”. The participants elevated their trunk until their elbows made contact with the thighs. The participants then returned to the starting position by lowering their shoulder blades to the mat. The final score was recorded as the number of successfully completed repetitions.
Pull-ups (boys)
Pull-ups were used to indicate upper body muscular endurance. From an upright standing position, participants jumped up and grasped an overhead bar using an overhand grip with arms fully extended. Participants were asked to use their arms to pull the body up until the chin cleared the top of the bar and then lower their body again to a position with the arms extended. The final score was recorded as the number of successfully completed repetitions.
Statistical analysis
Data were checked for Gaussian distributions using k-density plots. Extreme outliers were removed from the data set using a z-score cut-point of ±5.0. Differences between the sexes on all observed variables were examined using independent t-tests. Effect sizes were calculated using Cohen’s d, where d < 0.20 indicated a small effect, d ≈ 0.50 a medium effect, and d ≥ 0.80 a large effect [23]. To examine the independent predictive relationships between each continuous fitness test score and SES, multi-level general linear mixed effect models were employed. Random intercepts were used at the region level. Likelihood ratio tests with chi-square statistics were employed to test whether the multi-level models were statistically different from the naïve model assuming no clustering within the data structure. Analyses were conducted for the total sample and within sex groups to test for sex modifying effects. Age was entered as a covariate within each of the models. The reporting of the results included the adjusted parameter (b-coefficient) estimates with 95% Confidence Intervals.
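The outlier screening and effect size steps can be illustrated with a short sketch. The study's analyses were run in STATA; the Python below is only an illustration under stated assumptions: the function names `remove_extreme` and `cohens_d` are hypothetical, z-scores are computed over the whole set of values passed in (the text does not say whether screening was done within sex or grade), and Cohen's d uses the pooled standard deviation, which is the common definition but is not stated in the text.

```python
from math import sqrt
from statistics import mean, stdev


def remove_extreme(values, cut=5.0):
    """Keep only values whose z-score lies within +/- cut (assumed pooled over all values)."""
    m, s = mean(values), stdev(values)
    if s == 0:
        return list(values)  # no spread, so nothing can be an outlier
    return [v for v in values if abs((v - m) / s) <= cut]


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled SD (an assumption)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    pooled_sd = sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (mean(group_a) - mean(group_b)) / pooled_sd


# Toy numbers for illustration only (not study data)
print(cohens_d([2, 4, 6], [1, 3, 5]))  # 0.5
```

With a sample standard deviation, a single value can only reach a z-score of 5 in samples of roughly 27 or more observations, so a ±5.0 cut-point is a very conservative screen suited to large data sets such as this one.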
To examine the predictive relationship between categorical fitness test achievement (i.e., no pass, pass, good, and excellent) and SES, multi-level ordered logistic models using Stata’s “meologit” command were employed. A likelihood ratio test and the Brant test were employed to test the proportional odds assumption of ordered logistic regression, which assumes that the coefficients describing the relationships among all levels of the dependent variable (i.e., levels of achievement) are invariant across those levels. Reporting of the results included the age-adjusted odds ratios (ORs) with corresponding 95% Confidence Intervals. The reference level for fitness test achievement was the “no pass” category. Categorical analyses were run for the total sample and within sex groups to test for modifying effects. All analyses had an a priori alpha level set at p ≤ 0.05 and were carried out using the Stata v15.0 statistical software package (College Station, Texas, USA).
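For readers less familiar with the approach, a multi-level proportional odds model of this kind can be written as below. This is a generic sketch, not the authors' exact specification: the symbols are introduced here for illustration, and the random intercept u_j is assumed to sit at the region level, as stated for the linear models.

```latex
% Y_{ij}: achievement category (1 = no pass, ..., 4 = excellent) of student i in region j
% \kappa_k: cut-points; u_j: random intercept for region j (assumed level of clustering)
\begin{align}
  \operatorname{logit} \Pr(Y_{ij} \le k)
    &= \kappa_k - \left( \beta_1\,\mathrm{SES}_{ij} + \beta_2\,\mathrm{Age}_{ij} + u_j \right),
    \qquad k = 1, 2, 3, \\
  u_j &\sim \mathcal{N}\!\left(0, \sigma_u^2\right), \\
  \mathrm{OR}_{\mathrm{SES}} &= \exp(\beta_1).
\end{align}
```

The proportional odds assumption corresponds to the coefficients β₁ and β₂ carrying no subscript k: the same age-adjusted odds ratio applies at every cut-point between achievement levels.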