Systematic review of the relationships between physical activity and health indicators in the early years (0-4 years)

Background: Given the rapid development during the early years (0-4 years), an understanding of the health implications of physical activity is needed. The purpose of this systematic review was to examine the relationships between objectively and subjectively measured physical activity and health indicators in the early years.

Methods: Electronic databases were originally searched in April 2016. Included studies needed to be peer-reviewed, written in English or French, and meet a priori study criteria. The population was apparently healthy children aged 1 month to 59.99 months/4.99 years. The intervention/exposure was objectively and subjectively measured physical activity. The comparator was various volumes, durations, frequencies, patterns, types, and intensities of physical activity. The outcomes were health indicators ranked as critical (adiposity, motor development, psychosocial health, cognitive development, fitness) and important (bone and skeletal health, cardiometabolic health, and risks/harm). The Grading of Recommendations Assessment, Development, and Evaluation (GRADE) framework was used to assess the quality of evidence for each health indicator by each study design.

Results: Ninety-six studies representing 71,291 unique participants from 36 countries were included. Physical activity interventions were consistently (>60% of studies) associated with improved motor and cognitive development, and psychosocial and cardiometabolic health. Across observational studies, physical activity was consistently associated with favourable motor development, fitness, and bone and skeletal health. For intensity, light- and moderate-intensity physical activity were not consistently associated with any health indicators, whereas moderate- to vigorous-intensity, vigorous-intensity, and total physical activity were consistently favourably associated with multiple health indicators. Across study designs, consistent favourable associations with health indicators were observed for a variety of types of physical activity, including active play, aerobic, dance, prone position (infants; ≤1 year), and structured/organized. Apart from ≥30 min/day of the prone position for infants, the most favourable frequency and duration of physical activity was unclear. However, more physical activity appeared better for health. Evidence ranged from “very low” to “high” quality.

Conclusions: Specific types of physical activity, total physical activity, and physical activity of at least moderate- to vigorous-intensity were consistently favourably associated with multiple health indicators. The majority of evidence was in preschool-aged children (3-4 years). Findings will inform evidence-based guidelines.

Electronic supplementary material: The online version of this article (10.1186/s12889-017-4860-0) contains supplementary material, which is available to authorized users.


Background
The health benefits of physical activity, in particular moderate- to vigorous-intensity physical activity (MVPA), have been frequently studied in school-aged children and youth (5-17 years) as well as adults (≥18 years) [1][2][3][4]. Accordingly, global recommendations on the amount of MVPA recommended for health benefits in these age groups exist [5]. In contrast, less research has focused on the health benefits of physical activity in the early years (0-4 years). Given that the early years are a critical and rapid period of physical, cognitive, social, and emotional development [6], determining the dose (e.g., frequency, intensity, time/duration, type) of physical activity needed for healthy growth and development is of great importance.
To better understand the dose of physical activity needed in the early years, in 2012 Timmons and colleagues conducted a systematic review that examined the relationship between physical activity and multiple health indicators in this age group [7]. Favourable associations between physical activity and some aspects of health, including adiposity, bone and skeletal health, motor skill development, psychosocial health, cognitive development, and cardiometabolic health, were reported [7]. However, within this review, cross-sectional studies were excluded a priori; consequently, only 22 studies were identified and limited information on the dose of physical activity required for health benefits was found [7].
The previous systematic review by Timmons and colleagues helped inform the first Canadian Physical Activity Guidelines for the Early Years [8]. Given the limited information on the dose of physical activity required for good health, guideline formation was influenced by expert opinion, international harmonization, and stakeholder input [8]. The guidelines state that for healthy growth and development, infants (<1 year) should be physically active several times daily, and toddlers (1-2 years) and preschoolers (3-4 years) should accumulate at least 180 min per day of physical activity at any intensity spread throughout the day and progress to 60 min per day of energetic play by 5 years of age [8]. These recommendations align with physical activity recommendations in Australia [9] and the United Kingdom [10].
Since the dissemination of physical activity guidelines for children of the early years in Australia, Canada, and the United Kingdom [8][9][10], a number of new studies have examined physical activity in this age group, primarily in preschool-aged children [11]. However, due to several gaps and limitations in the literature, it remains unclear whether children in the early years are sufficiently active for good health [11,12]. For example, no clear benchmark exists for the appropriate dose of physical activity in infants; limited research has been conducted with toddlers [13][14][15]; and estimates of the proportion of preschool-aged children meeting the physical activity guidelines vary considerably (27%-100%) [11]. This variation is partly due to different methodologies used across studies, and in particular different cut-points for light-intensity physical activity (LPA) [11]. Despite differences in cut-points used, most of the physical activity in preschool-aged children appears to be of low intensity [11,16,17]. Currently, the specific frequency, intensity, duration, and type of physical activity required for good health in the early years remain unclear.
To ensure physical activity guidelines are reflective of the most up-to-date scientific knowledge, it is important to revisit and update the available evidence [18]. As studies with cross-sectional designs were excluded in the 2012 review [7] that informed the current Canadian guidelines, not all available evidence was originally captured. Causality cannot be determined with cross-sectional studies. However, given the limited evidence, cross-sectional studies may help to expand the current understanding of the relationships between physical activity and health in the early years. Since the 2012 review, other systematic reviews have been completed but they have focused on specific types of physical activity (e.g., outdoor play, structured physical activity) or specific health indicators (e.g., motor development, cognitive development, psychosocial health) [19][20][21][22][23], and three of the five reviews only included preschool-aged children [20,22,23]. To our knowledge, no systematic review has been conducted that comprehensively examined the relationships between subjectively and objectively measured physical activity and a broad range of health indicators in infants, toddlers, and preschoolers across study designs. Therefore, the purpose of this systematic review was to examine the associations between objectively and subjectively measured physical activity and health indicators in the early years across all study designs. To help inform guideline updates or development, an additional purpose was to determine what dose of physical activity is associated with health indicators in children of the early years.

Protocol and registration
This systematic review was registered with the International Prospective Register of Systematic Reviews (PROSPERO; Registration no. CRD42016035937; available from: https://www.crd.york.ac.uk/PROSPERO/display_record.php?ID=CRD42016035937). It was conducted and reported following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement for reporting systematic reviews and meta-analyses [24].

Eligibility criteria
For a study to be included in this review, it had to be peer-reviewed, published, written in English or French, and meet a priori (i.e., before database searches and screening) determined Population, Intervention, Comparison, and Outcome (PICO) study criteria [25]. Conference abstracts and grey literature were not eligible because they may not be subject to the same peer-review rigour. However, preliminary results from registered clinical trials were eligible.

Population
The population was apparently healthy (i.e., general population, including samples of overweight/obese children but not samples of children exclusively with a diagnosed medical condition) young children (mean age: 1 month-59.99 months/4.99 years). Where an age range was reported instead of a mean, samples with a lower limit of 1 month-59.99 months/4.99 years and an upper limit of <6 years were eligible for inclusion. If a mean age or age range was not reported, samples described as infants, toddlers, and/or preschoolers were included. For longitudinal or experimental study designs, the age criterion applied to at least one measurement time point of the exposure. For feasibility (i.e., staff and funding restrictions and overall project timelines) and to maximize the generalizability of findings, experimental studies were required to have a minimum sample size of 15 participants in at least one intervention group and observational studies were required to have a minimum sample size of 100 participants. Setting minimum sample size inclusion criteria a priori is consistent with a similar systematic review in school-aged children and youth [2]; however, more lenient cut-offs were chosen a priori in the present review because it was anticipated the volume of research was lower in the early years age group. Age subgroups were defined as 1.0-12.99 months (≤1.0 year) for infants, 13.0-35.99 months (1.1-2.99 years) for toddlers, and 36.0-59.99 months (3.0-4.99 years) for preschoolers.
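The age subgroups and minimum sample sizes described above can be expressed as a small screening helper. This is only an illustrative sketch, not the review team's tooling: the function names are hypothetical, and treating the subgroup boundaries as half-open intervals (e.g., anything from 1 month up to, but not including, 13 months counts as an infant) is an assumption made to handle fractional ages.

```python
from typing import Optional

# Minimum sample sizes stated in the eligibility criteria:
# experimental: >= 15 participants in at least one intervention group
# observational: >= 100 participants
MIN_SAMPLE_SIZE = {"experimental": 15, "observational": 100}


def age_subgroup(age_months: float) -> Optional[str]:
    """Return the review's age subgroup for an age in months.

    Returns None when the age falls outside 1 month to 59.99 months.
    Half-open boundaries are an assumption of this sketch.
    """
    if 1.0 <= age_months < 13.0:
        return "infant"
    if 13.0 <= age_months < 36.0:
        return "toddler"
    if 36.0 <= age_months < 60.0:
        return "preschooler"
    return None


def meets_sample_size(design: str, n: int) -> bool:
    """Check n against the minimum for 'experimental' or 'observational' designs.

    For experimental studies, n is the size of the largest intervention group.
    """
    return n >= MIN_SAMPLE_SIZE[design]


if __name__ == "__main__":
    # Hypothetical sample: mean age 40 months, observational, 250 children.
    print(age_subgroup(40.0), meets_sample_size("observational", 250))
```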

Intervention (exposure)
The interventions were volumes, durations, frequencies, patterns, types, and intensities of physical activity. For this review, physical activity was defined as any bodily movement generated by skeletal muscles that results in energy expenditure above resting levels [26]. "Prone position" or "tummy time" in infants, and "outdoor time" in any age group, were considered eligible physical activity exposures. Total energy expenditure measured by doubly labelled water or direct/indirect calorimetry was not considered an eligible exposure because it includes resting metabolic rate and the thermic effect of food in addition to activity energy expenditure [27]; however, activity energy expenditure measured by these methods was eligible. Physical activity could be measured objectively (e.g., accelerometer, direct observation) or subjectively (e.g., proxy-report). For experimental studies, interventions had to target physical activity exclusively with no other health behaviours (e.g., physical activity and diet or physical activity and sedentary behaviour), but were not required to have reported a measured change in physical activity.

Comparison
The comparators were volumes, durations, frequencies, patterns, types, and intensities of physical activity. A comparator or control group was not required.

Outcomes (health indicators)
The outcomes were eight health indicators chosen by the review team and collaborators based on the scientific literature to reflect physical, social, and cognitive health. The review team and collaborators ranked the eight health indicators as "critical" or "important" in line with the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) framework [28,29]. Critical health indicators included: adiposity (e.g., overweight, obesity, body mass index [BMI], skinfold thickness, body fat), motor development (e.g., gross motor skills, fine motor skills, locomotor and object control skills), psychosocial health (e.g., self-efficacy, self-esteem, prosocial behaviour, aggression, social functioning, depressive symptoms, anxiety symptoms, quality of life), cognitive development (e.g., language development, attention, executive functioning), and fitness (e.g., cardiovascular fitness, musculoskeletal fitness). Important health indicators included: bone and skeletal health (e.g., bone mineral density, bone mineral content, skeletal area, Vitamin D), cardiometabolic health (e.g., blood pressure, insulin resistance, blood lipids), and risks/harm (e.g., injury, plagiocephaly).

Information sources and search strategy
The search strategies for this review were developed and peer-reviewed by two librarians with expertise in systematic reviews. The following databases were searched between April 14  No date or study design limits were included (see Additional file 1 for the complete search strategies). As more than 6 months had passed since the initial full search, a partial search update was conducted in all databases on November 1, 2016, to capture any randomized controlled trials (RCTs) or clustered RCTs that included "critical" health indicators. A partial search update rather than a full update was conducted because of logistical reasons (i.e., staff and funding restrictions and overall project timelines). Furthermore, a large volume of observational studies had already been captured, so it was a priority to focus on studies with designs that have the potential to provide the highest quality of evidence to inform review findings and guideline formation.
All records retrieved from the database searches were imported into Reference Manager Software (Version 11; Thomson Reuters, San Francisco, CA, USA), and duplicate records were removed by employing a two-step strategy. Specifically, duplicates were first identified automatically in Reference Manager; one member of the review team then manually checked and removed additional duplicates where appropriate. After de-duplication, records were imported into Distiller SR Software (Evidence Partners, Ottawa, ON, Canada) for screening. First, titles and abstracts were screened by two independent reviewers; if a record was included by at least one reviewer, the record was obtained for further screening. Second, full-text articles were obtained and screened by two independent reviewers. Agreement between reviewers was required for a study to be included or excluded. Discrepancies that could not be resolved by the two independent reviewers were resolved by discussions with a third reviewer or with the review team if needed.
The reference lists of relevant reviews identified during screening were also checked to see if any additional relevant studies could be identified. To capture registered clinical trials, two trial registries (https://clinicaltrials.gov and http://www.who.int/ictrp/en/) were searched on February 1, 2017, using search terms for physical activity and the early years age group. This final search was to detect any large studies that were in progress and could potentially overturn findings. If found, this pending new evidence would have been included in the discussion.

Data extraction
Descriptive study characteristics as well as information regarding the exposure, outcome, and results were extracted in Microsoft Excel for each included study. For the results, where applicable, information was extracted from both unadjusted models and the most fully adjusted model. Furthermore, a finding was deemed to be statistically significant when p < 0.05 was reported, even if statistical significance was defined differently in a study. One reviewer completed data extraction for each study and a second reviewer checked the extracted data. A third reviewer then checked all extracted results.

Quality assessment
The quality of evidence assessment for each included study design within each health indicator was guided by the GRADE framework [30]. Quality of evidence reflects the level of confidence in the estimated effects. Detailed information on GRADE methodology can be found elsewhere [30]. Briefly, five assessment criteria (risk of bias, inconsistency, indirectness, imprecision, other [e.g., dose-response evidence]) were used to rate quality of evidence as "high", "moderate", "low", or "very low". Quality of evidence ratings started at "high" for RCTs and "low" for all other experimental and observational studies. The quality of evidence could be downgraded for any study design due to limitations associated with the five assessment criteria. The review team decided a priori that if the only identified sources of bias were selection bias due to the use of a convenience sample or performance bias due to lack of intervention/control group blinding, the quality of evidence would not be downgraded because of the risk of bias. If no limitations were identified, the quality of evidence from non-randomized and observational study designs could be upgraded if large effect sizes or evidence of a dose-response gradient were reported. Since dose-response evidence could not be determined for cross-sectional studies, observations of a gradient of higher exposure with higher/lower outcome were considered a reason to upgrade the quality of evidence associated with this study design [29].
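The rating logic in the preceding paragraph can be sketched as a simple function. This is a hedged illustration rather than the GRADE tooling used by the review team: the function name and parameters are hypothetical, one level per "serious" limitation and two per "very serious" limitation follows the usual GRADE convention, and an upgrade of exactly one level is an assumption of this sketch.

```python
# Ordered from lowest to highest quality of evidence.
LEVELS = ["very low", "low", "moderate", "high"]


def grade_quality(randomized: bool, downgrade_levels: int = 0,
                  upgrade: bool = False) -> str:
    """Return a quality-of-evidence rating.

    randomized: True for RCTs (including clustered RCTs), which start at
        "high"; all other designs start at "low".
    downgrade_levels: total levels removed across the assessment criteria
        (assumed: 1 per "serious", 2 per "very serious" limitation).
    upgrade: True if a large effect or a dose-response (or exposure/outcome)
        gradient was reported. Applied only when there are no limitations
        and the design is not randomized; one level up is an assumption.
    """
    index = LEVELS.index("high" if randomized else "low")
    if downgrade_levels > 0:
        index = max(0, index - downgrade_levels)
    elif upgrade and not randomized:
        index = min(len(LEVELS) - 1, index + 1)
    return LEVELS[index]


if __name__ == "__main__":
    # RCT with very serious indirectness (two levels down).
    print(grade_quality(randomized=True, downgrade_levels=2))
    # Observational design with serious risk of bias and a reported gradient:
    # the limitation blocks the upgrade.
    print(grade_quality(randomized=False, downgrade_levels=1, upgrade=True))
```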
Risk of bias was the only criterion out of the five assessment criteria that was first assessed at the individual study level. The Cochrane risk of bias assessment was used for experimental studies [31]. For observational studies, the risk of selection bias, performance bias, selective reporting bias, detection bias, attrition bias, and other biases (e.g., inadequate control for key confounders) was assessed [32]. For all studies, risk of bias was assessed by one reviewer and checked by two other reviewers. Overall quality of evidence was evaluated by one reviewer and verified by the larger review team, including two members with expertise in systematic review methodology.

Data analysis
Two members of the review team with experience in conducting meta-analyses assessed the data for each health indicator to determine if any of the data was sufficiently homogeneous with regard to statistical, clinical, and methodological characteristics for meta-analyses. Due to high levels of heterogeneity in study design and measured outcomes, only one meta-analysis was possible for four studies that included adiposity as a health indicator [33][34][35][36]. Change (post-intervention minus baseline) values from studies were entered into Review Manager Software 5.3 (The Cochrane Collaboration, Copenhagen, Denmark). When necessary, standard deviations of change were calculated based on other available statistics in accordance with the Cochrane Handbook for Systematic Reviews of Interventions [31]. Additionally, one study [37] only presented results by sex-specific subgroups, and thus had to be entered into the meta-analysis accordingly. Based on the subjectively assessed heterogeneity of the interventions, random-effects models were used to calculate the weighted mean difference according to the DerSimonian and Laird method [38,39]. Due to the small number of studies included in the meta-analysis, sensitivity analyses and/or sub-group analyses were not possible.
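For readers unfamiliar with the DerSimonian and Laird approach, the following is a minimal sketch of the standard random-effects estimator. It is not the Review Manager implementation, and the function name and the input numbers are hypothetical. Each study contributes a mean difference in change scores and its variance; the between-study variance (tau squared) is estimated from Cochran's Q and added to each study's variance before inverse-variance pooling.

```python
import math
from typing import Sequence, Tuple


def dersimonian_laird(effects: Sequence[float],
                      variances: Sequence[float]) -> Tuple[float, float, float]:
    """Pool study effects with a DerSimonian-Laird random-effects model.

    Returns (pooled_effect, standard_error, tau_squared).
    """
    if len(effects) != len(variances) or len(effects) == 0:
        raise ValueError("effects and variances must be non-empty and equal in length")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the method-of-moments estimate of tau squared.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights include the between-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)
    return pooled, se, tau2


if __name__ == "__main__":
    # Hypothetical mean differences in BMI change and their variances.
    pooled, se, tau2 = dersimonian_laird([-0.10, 0.05, -0.20, 0.00],
                                         [0.02, 0.03, 0.05, 0.04])
    low, high = pooled - 1.96 * se, pooled + 1.96 * se
    print(f"pooled={pooled:.3f}, 95% CI=({low:.3f}, {high:.3f}), tau2={tau2:.3f}")
```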
A narrative synthesis was also conducted for all included studies. Results were first synthesized by health indicator and study design then further synthesized by intensity or type of physical activity. For fitness and cardiometabolic health, results were also synthesized by different dimensions of the indicator (i.e., cardiorespiratory fitness and other fitness measures; blood pressure, cholesterol, and triglycerides). Finally, a sub-group analysis was conducted to examine frequency and duration of physical activity. Since not all studies reported on frequency and duration, data were synthesized across health indicators but examined separately for experimental and observational study designs. For observational study designs, frequency and duration data were also synthesized for intensity and type of physical activity. When multiple associations were examined (e.g., physical activity and BMI and physical activity and waist circumference or sex-stratified analyses between physical activity and BMI), a study was classified in one of four mutually exclusive groups: 1) "favourable" if at least one favourable but no unfavourable associations were observed, 2) "unfavourable" if at least one unfavourable but no favourable associations were observed, 3) "null" if no favourable or unfavourable associations were observed, and 4) "mixed" if both favourable and unfavourable or favourable, unfavourable and null associations were all observed. Within the narrative analysis, all studies were weighted equally. Finally, unless otherwise stated, findings are based on samples classified as preschool-aged children.
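The four mutually exclusive groups described above can be written as a short rule. This is an illustrative sketch with hypothetical function names. The second helper applies the ">60% of studies" threshold that the abstract uses to define a consistent association; how that threshold was applied in practice is not detailed here, so the helper is an assumption.

```python
from typing import Sequence


def classify_study(associations: Sequence[str]) -> str:
    """Classify one study from its individual association results.

    Each entry is "favourable", "unfavourable", or "null".
    """
    has_favourable = "favourable" in associations
    has_unfavourable = "unfavourable" in associations
    if has_favourable and has_unfavourable:
        return "mixed"  # both directions observed, with or without null results
    if has_favourable:
        return "favourable"
    if has_unfavourable:
        return "unfavourable"
    return "null"


def is_consistent(study_labels: Sequence[str], label: str = "favourable") -> bool:
    """True if more than 60% of studies carry the given label.

    All studies count equally, matching the equal weighting in the
    narrative analysis. Integer arithmetic avoids rounding at exactly 60%.
    """
    if not study_labels:
        return False
    count = sum(1 for s in study_labels if s == label)
    return 5 * count > 3 * len(study_labels)


if __name__ == "__main__":
    # Hypothetical study reporting BMI (favourable) and waist circumference (null).
    print(classify_study(["favourable", "null"]))
    print(is_consistent(["favourable", "favourable", "null"]))
```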

Description of studies
After de-duplication, 20,848 titles and abstracts and 915 full-text articles were screened (see Fig. 1). It was determined that 96 studies (87 unique samples) met the inclusion criteria. Of the 96 studies, 4 were identified through the MEDLINE update of the full search. No additional studies were identified in the partial update search or the trial registry searches. Reasons for excluding full-text articles included: not original research (e.g., review; n = 116), non-English language or non-French language (n = 4), ineligible age (n = 321), special population (n = 19), no measure of physical activity (n = 155), no measure of a health indicator of interest/did not assess the association between physical activity and health indicator of interest (n = 98), sample size (n = 48), intervention did not exclusively target physical activity (n = 45), and other (e.g., physical activity was a covariate, not human participants; n = 13). Some full-text articles were excluded for multiple reasons. Additionally, nine full-text articles could not be located so these records were excluded.
Physical activity was measured objectively in 38 studies, primarily by accelerometers; 10 studies used direct observation, heart rate monitors, pedometers, and/or doubly labelled water (i.e., activity energy expenditure). Physical activity was measured subjectively in 48 studies by proxy-report questionnaire, log, or interview. Five studies used both objective and subjective measures of physical activity. For 15 studies that included physical activity interventions, physical activity was not measured but these studies were included in the review because the intervention targeted physical activity exclusively. The types of physical activity included in both observational and experimental studies were: active play, active transportation, aerobic, biking, dance, home-based, exercise play, indoor, leisure, outdoor, passive cycling, prone position, rough-and-tumble play, sport, structured/organized, walking, and weight bearing. Further information on the study design, sample, exposure, outcome, and main findings for all individual studies are summarized in Tables S1 to S8 in Additional file 2. It should be noted that the number of studies summed across study designs and across health indicators is more than 96 because 20 studies included more than one health indicator of interest, and five studies presented both longitudinal and cross-sectional findings.
In the RCT, the mean sum of four skinfolds was significantly lower in the intervention group whose parents received physical activity recommendations from a nurse when their child was an infant, compared to the control group, who did not receive recommendations [40]. However, no significant group differences were observed for percentage overweight, waist or hip circumference, or body fat percentage. Furthermore, physical activity did not significantly differ between the intervention and the control group [40]. The quality of evidence was downgraded from "high" to "low" because of very serious indirectness (see Table 1).
For the four clustered RCTs, a significant decrease in BMI was observed in the intervention group (structured/organized physical activity plus cognitive-behavioural training and resources) compared to the control group (structured/organized physical activity) in one study [34]. However, no significant differences in adiposity were observed between the intervention (structured/organized physical activity or aerobic physical activity or government-led physical activity program) and the control groups (standard care) in the other three studies [33,35,41]. Furthermore, physical activity did not significantly differ between the intervention and the control groups in one study [41]. The quality of evidence was downgraded from "high" to "low" because of a serious risk of bias and serious indirectness (see Table 1).
For the two non-randomized interventions, there were no significant differences in adiposity between intervention (structured/organized physical activity) and control (standard care) groups in one study [36] or from baseline to follow-up in another study (structured/organized physical activity) [42]. The quality of evidence was downgraded from "low" to "very low" because of a serious risk of bias (see Table 1).
Among the seven longitudinal studies, physical activity was favourably associated with adiposity for at least one association in three studies [43][44][45] and not associated with adiposity in three studies [46][47][48]; mixed findings were observed in one study [49]. For two of the studies that found some favourable associations, a number of null associations were also observed [44,50]. One study with favourable findings had an infant sample [45]. In regard to intensity or type of physical activity, at least one favourable association was observed between each of the following physical activity exposures and adiposity: total physical activity (TPA; 2/4 studies), MVPA (1/1 study), and aerobic physical activity (1/1 study). However, primarily null or mixed associations were observed between each of the following physical activity exposures and adiposity: vigorous-intensity physical activity (VPA), activity energy expenditure, home-based physical activity, leisure physical activity, and structured/organized physical activity (see Table 1). The quality of evidence was downgraded from "low" to "very low" because of a serious risk of bias (see Table 1).

percentiles), waist circumference (absolute, percentile), hip circumference, waist-to-hip ratio, waist circumference z-score (Netherlands reference data), waist circumference-for-age z-score, sum of skinfolds, triceps skinfold thickness, body fat % (bioelectrical impedance, dual-energy X-ray absorptiometry), fat mass index (dual-energy X-ray absorptiometry, air-displacement plethysmography), fat free mass index (dual-energy X-ray absorptiometry, air-displacement plethysmography), fat mass (dual-energy X-ray absorptiometry, air-displacement plethysmography), fat free mass (dual-energy X-ray absorptiometry), % fat mass, trunk fat mass index, lean mass index (dual-energy X-ray absorptiometry), and subjectively by weight status (CDC ≥85th percentile). In 2 studies, it was unclear whether weight status (CDC ≥85th percentile) or BMI was measured objectively or subjectively.

2441
TPA was favourably associated with adiposity (change in weight-for-height z-score but not waist circumference-for-age z-score in 1 study) in 2 studies [43,45] and not associated with adiposity in 2 studies [46,49].
MVPA was favourably associated with adiposity (fat free mass but not BMI, fat mass, or % fat mass in 1 study) in 1 study [49].
VPA was not associated with adiposity in 1 study [48]. Activity energy expenditure was favourably (fat free mass), unfavourably (BMI, fat mass), and not (% fat mass) associated with adiposity in 1 study [49].
Aerobic PA was favourably associated with adiposity (baseline PA only not change in PA) in 1 study [44].
Home-based PA was not associated with adiposity in 1 study [47].
Leisure PA was not associated with adiposity in 1 study [44]. TPA was not associated with adiposity in 1 study [51].
MPA was not associated with adiposity in 1 study [52].
VPA was not associated with adiposity in 1 study [52].
Outdoor PA was favourably associated with adiposity in 1 study [51] and not associated with adiposity in 1 study [53]. LPA was favourably associated with adiposity (waist circumference z-score but not BMI z-score) in 1 study [50], unfavourably associated with adiposity (% body fat and fat mass index but not trunk fat mass index and lean mass index) in 1 study [89], and not associated with adiposity in 6 studies [55,67,76,84,86,87]. LPA 5-min bouts were not associated with adiposity in 1 study [86].
MPA was unfavourably associated with adiposity in 1 study [50] and not associated with adiposity in 2 studies [55,89].
MVPA was favourably associated with adiposity (% fat mass but not BMI, fat free mass, fat mass in 1 study; boys only in 1 study; % body fat and fat mass index but not trunk fat mass index or lean mass index in 1 study; % fat mass and fat free mass index but not BMI, fat mass index, or waist circumference in 1 study; girls only and waist circumference at the 90th percentile but not the 10th, 25th, 75th percentiles or BMI z-score or waist circumference in 1 study) in 6 studies [49,54,55,60,88,89], unfavourably associated with adiposity (boys only and BMI z-score but not waist circumference in 1 study) in 3 studies [67,69,88], and not associated with adiposity in 8 studies [65,76,77,82,[84][85][86][87].
VPA was favourably associated with adiposity (boys only in 1 study; % body fat, fat mass index, trunk fat mass index but not lean mass index in 1 study; fat free mass index but not BMI, fat mass, fat mass index, and waist circumference in 1 study) in 4 studies [54,55,60,89], unfavourably associated with adiposity in 1 study [50], and not associated with adiposity in 3 studies [67,74,82]. Activity energy expenditure was favourably (fat free mass), unfavourably (BMI), and not (fat mass, % fat mass) associated with adiposity in 1 study [49].
Indoor PA was not associated with adiposity in 1 study [81]. Leisure PA was favourably associated with adiposity (intermediate vs. none but not high vs. none) in 1 study [59]. Outdoor PA was favourably associated with adiposity in 1 study [58] and not associated with adiposity in 8 studies [61,73,75,78–81,83]. Organized sport was unfavourably associated with adiposity (girls only) in 1 study [68]. Structured/organized PA was favourably associated with adiposity in 1 study [57]. Active play was favourably associated with adiposity (weekdays only in 1 study) in 2 studies [62,65] and not associated with adiposity in 1 study [71]. Active transportation was not associated with adiposity in 1 study [70].

The intervention did not result in a significant change in physical activity [40]
c Quality of evidence was downgraded from "high" to "low" because of very serious indirectness
d Includes 4 clustered RCTs [33–35,41]
e Unclear whether outcome assessors were blinded to group allocation and unclear if the outcome was objectively measured in 1 study [34]. Large amount of missing data primarily because mean attendance at child care was 48% and it is unknown if the reason for poor attendance was related to adiposity in 1 study [41]. Physical activity was not measured so it is unknown if the intervention resulted in a significant change in physical activity in 1 study [35]
f The intervention did not result in a significant change in physical activity in 1 study [41]
g Quality of evidence was downgraded from "high" to "low" because of serious risk of bias and serious indirectness
h Includes 2 non-randomized interventions [36,42]
i No control group in 1 study [42]. Physical activity was not measured so it is unknown if the intervention resulted in a significant change in physical activity in 2 studies [36,42]
j Quality of evidence was downgraded from "low" to "very low" because of serious risk of bias
k Includes 7 longitudinal studies [43–49]
l Convenience sample was used in 1 study [44]. Psychometric properties unknown for the subjective physical activity measures in 3 studies [44,45,47]. Large unexplained loss to follow-up and incomplete data in 1 study [45]. No potential confounders were adjusted for in 2 studies [43,45]. Potentially inappropriate statistical analysis: one study mutually adjusted for other movement behaviours in the fully adjusted models [49]
m A dose-response gradient of higher aerobic PA and MVPA with better adiposity was observed in 2 studies [44,49]. A dose-response gradient of higher activity energy expenditure was associated with both better and worse adiposity depending on the adiposity measure in 1 study [49]
n Quality of evidence was downgraded from "low" to "very low" because of serious risk of bias; because of this limitation, was not upgraded for a dose-response gradient
o Includes 3 case-control studies [51–53]
p Psychometric properties unknown for the subjective physical activity measures in 3 studies [51–53]
q Quality of evidence was downgraded from "low" to "very low" because of serious risk of bias
Favourable and unfavourable associations between physical activity and adiposity observed across studies
u A gradient for higher TPA, MVPA, VPA, activity energy expenditure, outdoor PA, and physical education with better adiposity was observed in 6 studies [49,55,57,58,88,89]. A gradient for higher activity energy expenditure and LPA, MVPA with worse adiposity was observed in 3 studies [49,88,89]
v Quality of evidence was downgraded from "low" to "very low" because of serious risk of bias and serious inconsistency; because of this limitation, was not upgraded for an exposure/outcome gradient

For the three case-control studies, physical activity was favourably associated with adiposity in one study [51] and not associated with adiposity in two studies [52,53]. One study with null findings had an infant and toddler sample [53]. In terms of the intensity or type of physical activity, at least one favourable association was observed between outdoor physical activity and adiposity (1/2 studies). However, primarily null associations were observed between each of the following physical activity exposures and adiposity: TPA, moderate-intensity physical activity (MPA), and VPA (see Table 1). The quality of evidence was downgraded from "low" to "very low" because of a serious risk of bias (see Table 1).

Motor development
The association between physical activity and motor development was examined in 23 studies (21 unique samples; see Table 2 and Table S2 in Additional file 2). Among the four RCTs, significant increases in motor development were observed in the intervention groups (planned passive cycling or structured/organized physical activity) compared to the control groups (standard care) in three studies [90][91][92]. One intervention involved an infant sample [91]. In the fourth study, no significant differences in motor development were observed between the intervention (parents received physical activity recommendations from a nurse when their child was an infant) and control (no recommendations) groups [40]; however, physical activity was not significantly different between groups [40]. The quality of evidence was downgraded from "high" to "low" because of a serious risk of bias and serious indirectness (see Table 2).
In the two clustered RCTs, greater increases in total motor development and jumping were observed in the intervention group (structured/organized physical activity) compared to the control group (standard care) in one study; however, no such increases were seen in running, hopping, catching, or kicking [33]. In the second study, no significant difference was observed in motor skills between the intervention (government-led physical activity program) and control (standard care) groups [41]. However, physical activity was also not significantly different between groups [41]. The quality of evidence was downgraded from "high" to "low" because of a serious risk of bias and serious indirectness (see Table 2).
The PA intervention (structured/organized PA) was favourably associated with improved motor development (total score and jumping individual score but not for running, hopping, catching, and kicking) in 1 study [33].
The PA intervention (government-led PA program) was not associated with motor development in 1 study [41].
Indoor PA was favourably associated with motor development (throwing at target only) in 1 study [81].
Outdoor PA was not associated with motor development in 1 study [81].
Prone position was favourably associated with motor development (gross motor development but not fine motor development in 1 study) in 3 studies [97–99].

LPA: light-intensity physical activity; MVPA: moderate- to vigorous-intensity physical activity; PA: physical activity; RCT: randomized controlled trial; TPA: total physical activity; VPA: vigorous-intensity physical activity
a Includes 4 RCTs [40,90–92]
b No intention-to-treat analysis; parent-child dyads were excluded if they did not carry out the management plan or if they became sick during the study; and the physical activity program was interrupted in 1 study [90]. Physical activity was not measured, so it is unknown if the intervention resulted in a significant change in physical activity in 3 studies [90–92]
c The intervention did not result in a significant change in physical activity in 1 study [40]
d Quality of evidence was downgraded from "high" to "low" because of serious risk of bias and serious indirectness
e Includes 2 clustered RCTs [33,41]
f Large amount of missing data primarily because mean attendance at child care was 48%, and it is unknown if the reason for poor attendance was related to the motor development in 1 study [41]
g The intervention did not result in a significant change in physical activity in 1 study [41]
h Quality of evidence was downgraded from "high" to "low" because of serious risk of bias and serious indirectness
i Includes 6 non-randomized interventions [36,42,93–96]
j The outcome was measured post-intervention only in 2 studies [93,96]. No control group in 1 study [42]. Physical activity was not measured so it is unknown if the intervention resulted in a significant change in physical activity in 6 studies [36,42,93–96]
k Quality of evidence was downgraded from "low" to "very low" because of serious risk of bias
Quality of evidence was downgraded from "low" to "very low" because of serious risk of bias; because of this limitation, was not upgraded for an exposure/outcome gradient

Among the six non-randomized interventions, significant increases in at least one measure of motor development were observed in the intervention group (free play and structured activities, structured/organized physical activity, dance program, or swimming) compared to the control group (usual care) in five studies [36,93–96], and significant increases from baseline to follow-up in the 12-m run and standing long jump were observed in one study (structured/organized physical activity) [42]. However, for two of the interventions, more null than favourable effects were observed with the different motor development measures [94,96]. One intervention had an infant sample at baseline [96]. The quality of evidence was downgraded from "low" to "very low" because of a serious risk of bias (see Table 2).
In the longitudinal study, higher duration of prone positioning at 4 months of age was favourably associated with the earlier achievement of several developmental milestones and gross motor development at 6 months but not at 24 months of age [97]. However, no significant differences were observed in fine motor development [97]. In separate analyses, no significant differences in motor development at 6 and 24 months of age were observed between infants who had, versus had not, experienced prone position at 4 months of age [97]. Apart from "crawled on abdomen", significant differences for achievement of developmental milestones were also not observed between groups [97]. In further analyses comparing infants who preferred prone position at 6 months of age to those who did not, no significant differences were observed in gross and fine motor development at 24 months of age; however, the prone-preference group achieved several developmental milestones significantly earlier [97]. The quality of evidence was downgraded from "low" to "very low" because of a serious risk of bias (see Table 2).
Among the 10 cross-sectional studies, physical activity was favourably associated with at least one measure of motor development in seven studies [56,67,69,97–100], unfavourably associated with motor development in one study [101], and not associated with motor development in one study [86]; mixed findings were observed in one study [81]. Three of the studies with favourable associations [97–99] and one study with unfavourable associations [101] had infant samples. One study with null findings had a toddler sample [86]. For the intensity or type of physical activity, at least one favourable association was observed between each of the following physical activity exposures and motor development: MVPA (3/4 studies), VPA (1/1 study), indoor physical activity (1/1 study), and prone position (3/3 studies). However, primarily null or mixed findings were observed between each of the following physical activity exposures and motor development: TPA, LPA, LPA bouts, MVPA bouts, and outdoor physical activity (see Table 2). The quality of evidence was downgraded from "low" to "very low" because of a serious risk of bias (see Table 2).

Psychosocial health
The association between physical activity and psychosocial health was examined in 11 studies (9 unique samples; see Table 3 and Table S3 in Additional file 2). Among the two RCTs, greater increases in psychosocial health were observed in the intervention groups (planned passive cycling or dance program) compared to the control groups (standard care) [90,102]. One of the interventions had an infant sample [90]. The quality of evidence was downgraded from "high" to "moderate" because of a serious risk of bias (see Table 3).
In the clustered RCT, no significant differences in quality of life were observed between the intervention (government-led physical activity program) and control (standard care) groups [41]. Physical activity was also not significantly different between groups [41]. The quality of evidence was downgraded from "high" to "very low" because of a serious risk of bias and very serious indirectness (see Table 3).
Among the two longitudinal studies, sport participation was favourably associated with psychosocial health in one study [103], and TPA was favourably associated with psychosocial health in one study [104] but not the other [103]. The quality of evidence was downgraded from "low" to "very low" because of a serious risk of bias (see Table 3).
Among the six cross-sectional studies, physical activity was favourably associated with at least one measure of psychosocial health in one study [105], unfavourably associated with at least one measure of psychosocial health in three studies [101,106,107], and not associated with psychosocial health in two studies [108,109]. However, primarily null associations were observed in all studies. One study with unfavourable associations had an infant sample [101]. In regard to intensity or type of physical activity, at least one favourable association was observed between MVPA and psychosocial health (1/2 studies), and at least one unfavourable association was observed between bike riding and psychosocial health (2/2 studies). However, primarily null or mixed findings were observed between each of the following physical activity exposures and psychosocial health: TPA, exercise play, rough-and-tumble play, and walking (see Table 3). The quality of evidence was downgraded from "low" to "very low" because of a serious risk of bias and serious inconsistency (see Table 3).
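The quality ratings reported in this section follow the usual GRADE arithmetic: evidence from RCTs starts at "high" and observational evidence at "low", a "serious" concern removes one level, a "very serious" concern removes two, and the rating cannot fall below "very low". The sketch below is a minimal illustration of that arithmetic only. The function name, the dictionary layout, and the domain labels are hypothetical and are not taken from the review, and upgrading (for example, for a dose-response gradient) is deliberately not modelled.

```python
# GRADE quality levels, ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]

# Number of levels removed per concern rating (standard GRADE convention).
PENALTY = {"none": 0, "serious": 1, "very serious": 2}


def grade_quality(start, concerns):
    """Return the quality level after applying downgrades.

    start: starting level ("high" for RCTs, "low" for observational designs).
    concerns: dict mapping a domain label (illustrative, free-form) to a
        rating found in PENALTY.
    """
    steps = sum(PENALTY[rating] for rating in concerns.values())
    # The rating is floored at "very low" (index 0).
    index = max(LEVELS.index(start) - steps, 0)
    return LEVELS[index]


# The three psychosocial health ratings described above.
rcts = grade_quality("high", {"risk of bias": "serious"})
clustered_rct = grade_quality(
    "high", {"risk of bias": "serious", "indirectness": "very serious"}
)
cross_sectional = grade_quality(
    "low", {"risk of bias": "serious", "inconsistency": "serious"}
)
print(rcts, clustered_rct, cross_sectional, sep=" | ")
```

Under these assumptions, the calls reproduce the ratings given for the two RCTs ("moderate"), the clustered RCT ("very low"), and the cross-sectional studies ("very low", where the floor applies).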

Cognitive development
The association between physical activity and cognitive development was examined in 13 studies (13 unique samples; see Table 4 and Table S4 in Additional file 2). Among the two RCTs, significant increases in cognitive development were observed in the intervention groups (planned passive cycling or structured/organized physical activity) compared to the control groups (standard care) [90,91]. One intervention involved an infant sample [90]. The quality of evidence was downgraded from "high" to "moderate" because of a serious risk of bias (Table 4).
TPA was favourably associated with psychosocial health (active vs. less active but not active vs. average) in 1 study [104] and not associated with psychosocial health in 1 study [103].
Sport participation was favourably associated with psychosocial health (high risk and recovery trajectories but not the rebound trajectory) in 1 study [103].
TPA was unfavourably associated with psychosocial health in 1 study [101] and not associated with psychosocial health in 1 study [109].
MVPA was unfavourably associated with psychosocial health in 1 study [107] and not associated with psychosocial health in 1 study [109].
Bike riding was unfavourably associated with psychosocial health (for boys only on weekdays only in 1 study) in 2 studies [106,107].
Exercise play was favourably associated with psychosocial health (mixed gender [not non-mediated] and same gender but not other gender groups) in 1 study [105], unfavourably associated with psychosocial health (boys only, weekend only, and only for > 2 and ≤ 24 h group) in 1 study [106], and not associated with psychosocial health in 1 study [107].
Rough-and-tumble play was not associated with psychosocial health in 2 studies [105,108].
Walking was not associated with psychosocial health in 2 studies [106,107].

VERY LOW o
MVPA: moderate- to vigorous-intensity physical activity; PA: physical activity; RCT: randomized controlled trial; TPA: total physical activity
a Includes 2 RCTs [90,102]
b No intention-to-treat analysis; parent-child dyads were excluded if they did not carry out the management plan or if they became sick during the study and the physical activity program was interrupted in 1 study [90]. Physical activity was not measured, so it is unknown if the intervention significantly changed physical activity in 2 studies [90,102]
c Quality of evidence was downgraded from "high" to "moderate" because of serious risk of bias
d Includes 1 clustered RCT [41]
e Large amount of missing data primarily because mean attendance at child care was 48%, and it is unknown if the reason for poor attendance was related to psychosocial health
f The intervention did not result in a significant change in physical activity
g Quality of evidence was downgraded from "high" to "very low" because of serious risk of bias and very serious indirectness
h Includes 2 longitudinal studies [103,104]
i No psychometric properties reported for the subjective physical activity measures in 2 studies [103,104]
j A significant trend was observed for poor quality of life when moving from the active to less active groups in 1 study [104]
k Quality of evidence was downgraded from "low" to "very low" because of serious risk of bias; because of this limitation, was not upgraded for a dose-response gradient
l Includes 6 cross-sectional studies [101,105–109]
m Convenience sample was used in 5 studies [101,105–108]. Physical activity was measured only during child care in 1 study [109]. Potential confounders were not adjusted for in 3 adjusted studies [101,107,109]. No psychometric properties reported for the subjective physical activity measures in 1 study [101]. No psychometric properties reported for the outcome measure in 2 studies [101,105]. Large amount of missing data in 1 study [106]
n Favourable and unfavourable associations between physical activity and psychosocial health observed across studies
o Quality of evidence was downgraded from "low" to "very low" because of serious risk of bias and serious inconsistency

The PA intervention (physical exercises to enact meanings of words) was favourably associated with improved cognitive development [110].
The PA intervention (physical exercises unrelated to words) was favourably associated with improved cognitive development (cued recall of words but not free recall of words) [110].
TPA was unfavourably associated with cognitive development in 1 study [101] and not associated with cognitive development in 1 study [109].
MVPA was not associated with cognitive development in 1 study [109].
Outdoor PA (at child care) was not associated with cognitive development in 1 study [58].

VERY LOW m
MVPA: moderate- to vigorous-intensity physical activity; PA: physical activity; RCT: randomized controlled trial; TPA: total physical activity
a Includes 2 RCTs [90,91]
b No intention-to-treat analysis; parent-child dyads were excluded if they did not carry out the management plan or if they became sick during the study and the physical activity program was interrupted in 1 study [90]. Physical activity was not measured, so it is unknown if the intervention significantly changed physical activity in 2 studies [90,91]
c Quality of evidence was downgraded from "high" to "moderate" because of serious risk of bias
d Includes 1 clustered RCT [110]
e Includes 4 non-randomized interventions [93,111–113]
f Physical activity was not measured, so it is unknown if the intervention significantly changed physical activity in 2 studies [93,113]
g Quality of evidence was downgraded from "low" to "very low" because of serious risk of bias
h Includes 3 cross-over trials [114–116]
i Condition was not randomly assigned in 1 study [116]. Physical activity was not measured, so it is unknown if there were significant differences in physical activity between conditions in 2 studies [114,116]. Unclear what conditions had significant differences in the outcome measure in 1 study [116]
j Quality of evidence was downgraded from "low" to "very low" because of serious risk of bias
k Includes 3 cross-sectional studies [58,101,109]
l Convenience sample was used in 1 study [101]. Physical activity was measured only during child care in 2 studies [58,109]. No potential confounders were adjusted for in 2 adjusted studies [101,109]. No psychometric properties reported for the subjective physical activity measure or the outcome measure in 1 study [101]
m Quality of evidence was downgraded from "low" to "very low" because of serious risk of bias

For the clustered RCT, significant increases in free and/or cued recalls of previously learned Italian words were observed in the physical activity intervention groups (physical activity to enact meaning of words and physical activity unrelated to words) compared to the control groups (no physical activity) [110]. The quality of evidence remained at "high" (Table 4).
Among the four non-randomized interventions, a significant increase in at least one measure of cognitive development was observed in the intervention groups that participated in the intervention (academic lessons, free play, and structured activities) compared to the control groups (standard care) in three studies [93,111,112], and significant increases in children's creativity at follow-up compared to baseline were reported in one study [113]. The quality of evidence was downgraded from "low" to "very low" because of a serious risk of bias (Table 4).
Among the three cross-over trials, at least one measure of cognitive development was significantly higher in the physical activity condition (MVPA breaks, structured/organized physical activity) compared to the control condition (typical instruction, sedentary session) in two studies [114,115], and attention was significantly higher after 10-, 20-, and 30-min outdoor recess conditions in one study [116]. The quality of evidence was downgraded from "low" to "very low" because of a serious risk of bias (Table 4).
Among the three cross-sectional studies, physical activity was unfavourably associated with cognitive development in one study [101] and not associated with cognitive development in two studies [58,109]. The study with unfavourable associations had a sample of infants [101]. In regard to intensity or type of physical activity, at least one unfavourable association was observed between TPA and cognitive development (1/2 studies). However, MVPA and outdoor physical activity were not associated with cognitive development (see Table 4). The quality of evidence was downgraded from "low" to "very low" because of a serious risk of bias (see Table 4).

Fitness
The association between physical activity and fitness was examined in three studies (three unique samples; see Table 5 and Table S5 in Additional file 2). In the longitudinal study, TPA was favourably associated with cardiorespiratory fitness [43]. The quality of evidence was downgraded from "low" to "very low" because of a serious risk of bias (see Table 5).
Among the two cross-sectional studies, physical activity was favourably associated with at least one measure of fitness in both studies [55,117]. As for physical activity intensity or type, at least one favourable association was observed between each of the following physical activity exposures and cardiorespiratory fitness: TPA (2/2 studies), MVPA (1/1 study), and VPA (1/1 study). Similarly, at least one favourable association was observed between each of the following physical activity exposures and muscular fitness and speed-agility: TPA (1/1 study), MVPA (1/1 study), and VPA (1/1 study). However, LPA and MPA were not associated with cardiorespiratory fitness, muscular fitness, or speed-agility (see Table 5). The quality of evidence was downgraded from "low" to "very low" because of a serious risk of bias (see Table 5).

Bone and skeletal health
The association between physical activity and bone and skeletal health was examined in seven studies (seven unique samples; see Table 6 and Table S6 in Additional file 2). For the RCT, total bone mineral content in a baseline sample of infants was not significantly different between the intervention (structured/organized physical activity) and control (fine motor activity) groups [118]. However, physical activity was also not significantly different between groups. The quality of evidence was downgraded from "high" to "low" because of very serious indirectness (see Table 6).
Among the six cross-sectional studies, favourable associations were reported between physical activity and at least one measure of bone and skeletal health in five studies [119][120][121][122][123], and null associations were reported in one study [124]. One study with favourable associations had a sample of infants [119]. In regard to intensity or type of physical activity, at least one favourable association was observed between each of the following physical activity exposures and bone and skeletal health: TPA (2/3 studies), MPA (1/2 studies), MVPA (2/3 studies), VPA (2/2 studies), leisure time physical activity (1/1 study), outdoor physical activity (3/3 studies), and weight-bearing physical activity (1/1 study). Conversely, LPA was not associated with bone and skeletal health (see Table 6). The quality of evidence was downgraded from "low" to "very low" because of a serious risk of bias (see Table 6).

Cardiometabolic health
The association between physical activity and cardiometabolic health was examined in nine studies (eight unique samples; see Table 7 and Table S7 in Additional file 2). In the non-randomized intervention, children in the intervention group (structured/organized physical activity) had significantly lower diastolic blood pressure than the controls (standard care) [125]. The quality of evidence was downgraded from "low" to "very low" because of a serious risk of bias (see Table 7).
Among the two longitudinal studies, physical activity was not associated with any measure of blood pressure, cholesterol, or triglycerides in one study [43], and mixed findings with blood pressure were observed in the other study, though primarily null associations were observed [126]. In regard to intensity or type of physical activity, at least one favourable association was observed between aerobic physical activity and blood pressure (1/1 study), and at least one unfavourable association was observed TPA was favourably associated with fitness (only for 95th, 90th, 75th but not 50th and 25th percentiles of vector magnitude in 1 study) in 2 studies [55,117].
LPA was not associated with fitness in 1 study [55].
MPA was not associated with fitness in 1 study [55].
VPA was favourably associated with fitness in 1 study [55].
Other fitness measures TPA was favourably associated with muscular fitness and speed-agility (only for 95th, 90th, 75th but not 50th and 25th percentiles of vector magnitude and not for standing long jump at the 75th percentile) in 1 study [55].
LPA was not associated with muscular fitness and speed-agility in 1 study [55].
MPA was not associated with muscular fitness and speed-agility in 1 study [55].
MVPA was favourably associated with muscular fitness (standing long jump but not handgrip strength) and speed-agility in 1 study [55].
VPA was favourably associated with muscular fitness and speed-agility in 1 study [55].

VERY LOW g
LPA: light-intensity physical activity; MPA: moderate-intensity physical activity; MVPA: moderate-to vigorous-intensity physical activity; TPA: total physical activity; VPA: vigorous-intensity physical activity a Includes 1 longitudinal study [43] b The findings that were reported did not adjust for any potential confounders c Quality of evidence was downgraded from "low" to "very low" because of serious risk of bias d Includes 2 cross-sectional studies [55,117] e No potential confounders were adjusted for; a convenience sample was used and it is unclear if the fitness measure is suitable for this age group in 1 study [117]. Potentially inappropriate statistical analysis: other movement behaviours were mutually adjusted for in the fully adjusted models in 1 study [55] f A gradient for higher TPA, MVPA, VPA with higher fitness was observed in 1 study [55] g Quality of evidence was downgraded from "low" to "very low" because of serious risk of bias; because of this limitation, was not upgraded for an exposure/outcome gradient Mean baseline age ranged from 9.27-57.12 months. One study reported the baseline age as 6 months but a mean was not given. Data were collected by RCT and cross-sectional study designs. Several bone and skeletal health measures were assessed by X-ray absorptiometry including: total bone mineral content, bone mineral density of the lumbar spine (L2-L4), total body bone area, periosteal circumference of tibia, endosteal circumference of tibia, cortical bone area of tibia, hip bone area, hip bone mineral content, areal bone mineral density, and estimated volumetric bone mineral density. Bone and skeletal health was also assessed by vitamin D (25-(OH)-vitamin D3 measured in serum), vitamin D (25-(OH)-vitamin D3 parathyroid hormone in non-fasting venous blood samples), and bone stiffness (quantitative ultrasound). All outcomes were objectively measured.

14,774
TPA was favourably associated with bone and skeletal health in 2 studies [119,123] and not associated with bone and skeletal health in 1 study [124].
LPA was not associated with bone and skeletal health in 1 study [123].
MPA was favourably associated with bone and skeletal health in 1 study [123] and not associated with bone and skeletal health in 1 study [124].
MVPA was favourably associated with bone and skeletal health in 2 studies [122,123] and not associated with bone and skeletal health in 1 study [124].
VPA was not associated with bone and skeletal health in 2 studies [123,124]. Leisure time physical activity was favourably associated with bone and skeletal health in 1 study [123].
Weight-bearing activity was favourably associated with bone and skeletal health in 1 study [123]. The intervention did not significantly change physical activity c Quality of evidence was downgraded from "high" to "low" because of very serious indirectness d Includes 6 cross-sectional studies [119][120][121][122][123][124] e Potential confounders were not adjusted for in 2 studies [120,121]. Potentially inappropriate statistical analysis: other movement behaviours were mutually adjusted for in the fully adjusted models in 1 study [123]. No psychometric properties were reported for the subjective physical activity measure in 4 studies [119][120][121]123]. A convenience sample was used in 2 studies [120,124] f A gradient for higher TPA, MPA, MVPA, leisure time physical activity, outdoor activity, and weight-bearing physical activity with better bone and skeletal health was observed in 2 studies [119,123] g Quality of evidence was downgraded from "low" to "very low" because of serious risk of bias; because of this limitation, was not upgraded for an exposure/outcome gradient Mean baseline age ranged from 3-4.9 years. One study reported only that the children were preschool age. Data were collected by non-randomized intervention, longitudinal with up to 2 years follow-up, and cross-sectional study designs. Cardiometabolic health was assessed by mean arterial pressure, DBP, SBP, total cholesterol, total serum cholesterol, HDL, triglycerides, HDL 2 , LDL, LDL/HDL, total serum cholesterol/HDL, HDL/total triglycerides, and clustered cardiovascular risk score (SBP, triglycerides, total cholesterol/HDL, HOMA-IR, sum of two skinfolds). All outcomes were objectively measured.

BP
Aerobic PA was favourably associated with BP (SBP but not DBP, boys only, 2-year follow-up but not 1-year follow-up) in 1 study [126]. Leisure PA was unfavourably associated with BP (DBP but not SBP, boys only, 1-year follow-up but not 2-year follow-up) in 1 study [126]. Structured PA was not associated with BP (SBP or DBP) in 1 study [126]. Cholesterol TPA was not associated with cholesterol (total serum cholesterol, HDL, HDL 2 , LDL, LDL/HDL, or total serum cholesterol/HDL) in 1 study [43]. Triglycerides TPA was not associated with triglycerides in 1 study [43]. TPA was favourably associated with clustered risk score (boys only, Quartile 1 vs. Quartile 5 only) in 1 study [127].
MPA was not associated with clustered risk score in 1 study [127].
MVPA was not associated with clustered risk score in 1 study [127].
Aerobic PA was not associated with BP (SBP or DBP) in 1 study [126].
Indoor PA was not associated with BP (SBP or DBP) in 1 study [81].
Leisure PA was not associated with BP (SBP or DBP) in 1 study [126].
Outdoor PA was not associated with BP (SBP or DBP) in 1 study [81]. Structured PA was not associated with BP (SBP or DBP) in 1 study [126]. Cholesterol TPA was favourably associated with cholesterol (total cholesterol but not HDL) in 1 study [81] and not associated with cholesterol (total cholesterol, HDL, or HDL/total cholesterol) in 1 study [72].
Indoor PA was not associated with cholesterol (total cholesterol or HDL) in 1 study [81]. Outdoor PA was unfavourably associated with cholesterol (HDL but not total cholesterol) in 1 study [81]. Triglycerides TPA was not associated with cholesterol (total cholesterol, HDL, or HDL/total cholesterol) in 1 study [72].

VERY LOW k
BP: blood pressure; DBP: diastolic blood pressure; HDL: high-density lipoprotein cholesterol; HOMA-IR: homeostatic model assessment -insulin resistance; LDL: low-density lipoprotein cholesterol; MPA: moderateintensity physical activity; MVPA: moderate-to vigorous-intensity physical activity; PA: physical activity; SBP: systolic blood pressure; TPA: total physical activity; VPA: vigorous intensity physical activity a Includes 1 non-randomized intervention [125] b No intention-to-treat analysis; results are based on children who were measured at all 3 time points. Physical activity was not measured, so it is unknown if the intervention significantly changed physical activity c Quality of evidence was downgraded from "low" to "very low" because of serious risk of bias d Includes 2 longitudinal studies [43,126] e Potential confounders were not adjusted for in 1 study [43]. No psychometric properties were reported for the subjective physical activity measure in 1 study [126] f Quality of evidence was downgraded from "low" to "very low" because of serious risk of bias g Includes 6 cross-sectional studies [66,72,81,117,126,127] h No potential confounders were adjusted for in 5 studies [66,72,81,117,127]. Convenience sample in 1 study [117]. No psychometric properties were reported for the subjective physical activity measure in 1 study [126] i Favourable and unfavourable associations between physical activity and cardiometabolic health observed across studies j A gradient for higher TPA with worse total cholesterol was observed in 1 study [81] k Quality of evidence was downgraded from "low" to "very low" because of serious risk of bias and serious inconsistency; because of this limitation, was not upgraded for an exposure/outcome gradient Mean baseline age ranged from 7.4 weeks-24 months; where mean age was not reported, baseline age ranged from 2 months-4.5 years. 
Data were collected by case cross-over and longitudinal with 4.5-6.5 years follow-up, case control, and cross-sectional study designs. Risks/harm was assessed as injury risk (proxy-report; Participant Event Monitoring method), injury severity (proxy-report; minor injury severity scale), fracture incidence (proxy-report), and plagiocephaly (objectively measured). Outdoor time was favourably associated with fracture incidence in the winter but unfavourably associated with fracture incidence in the summer [129]. TPA was favourably associated with plagiocephaly (at present but not at 6 weeks of age) [130].
Prone position was favourably associated with plagiocephaly (for ≥5 min/day but not whether it was provided or not) at 6 weeks of age [130].
[129]
e No psychometric properties were reported for outdoor time and fracture incidence, and there was a large unexplained loss to follow-up
f Outdoor time was the measure of physical activity
g Dose-response evidence was observed for higher outdoor time with lower fracture incidence
h Quality of evidence was downgraded from "low" to "very low" because of serious risk of bias and serious indirectness; because of these limitations, was not upgraded for dose-response evidence
i Includes 1 case-control study [130]
j No psychometric properties were reported for the subjective physical activity measures
k Quality of evidence was downgraded from "low" to "very low" because of serious risk of bias
l Includes 1 cross-sectional study [131]
m Convenience sample and no psychometric properties were reported for the subjective physical activity measure
n Quality of evidence was downgraded from "low" to "very low" because of serious risk of bias
between leisure physical activity and blood pressure (1/1 study). Structured physical activity was not associated with blood pressure [126]. Similarly, TPA was not associated with cholesterol or triglycerides (see Table 7). The quality of evidence was downgraded from "low" to "very low" because of a serious risk of bias (see Table 7). Among the six cross-sectional studies, physical activity was favourably associated with at least one measure of cardiometabolic health in one study [127] and unfavourably associated with cardiometabolic health in one study [117]; null associations were found in three studies [66,72,126], and mixed findings were found in one study [81]. In the study where some favourable associations were observed, primarily null associations were observed [127].
In regard to intensity or type of physical activity, at least one favourable association was observed between each of the following physical activity exposures and a clustered risk score: TPA (1/1 study) and VPA (1/1 study). However, MPA and MVPA were not associated with a clustered risk score. The following types of physical activity were not associated with blood pressure: TPA, aerobic physical activity, indoor physical activity, outdoor physical activity, leisure physical activity, and structured physical activity. At least one favourable association was observed between TPA and cholesterol (1/2 studies), and at least one unfavourable association was observed between outdoor physical activity and cholesterol (1/1 study). However, indoor physical activity was not associated with cholesterol. Similarly, TPA was not associated with triglycerides (see Table 7). The quality of evidence for the cross-sectional studies was downgraded from "low" to "very low" because of a serious risk of bias and serious inconsistency (see Table 7).

Risks/harm
The association between physical activity and risks/harm was examined in four studies (four unique samples; see Table 8 and Table S8 in Additional file 2). In the case cross-over study, a high activity level, compared to a low activity level, was unfavourably associated with injury risk among toddlers, but activity level was not associated with injury severity [128]. The quality of evidence remained at "low" (see Table 8).
In the longitudinal study, findings differed based on the season, with more outdoor time in the summer associated with an increased likelihood of reporting a fracture, and more outdoor time in the winter associated with a decreased likelihood of reporting a fracture [129]. The quality of evidence was downgraded from "low" to "very low" because of a serious risk of bias and serious indirectness (see Table 8).
In the case-control study, cases (those with plagiocephaly) had an increased likelihood of being in the very inactive/inactive/average group compared to the active/very active group [130]. Cases were also more likely to participate in a lower duration of prone position per day. The quality of evidence was downgraded from "low" to "very low" because of a serious risk of bias (see Table 8).
In the cross-sectional study, no associations were observed between first age of tummy time or tummy time duration and plagiocephaly [131]. Infants with a lower frequency of tummy time were more likely to have plagiocephaly in unadjusted models but not in adjusted models. The quality of evidence was downgraded from "low" to "very low" because of a serious risk of bias (see Table 8).
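The quality ratings reported for the four risks/harm studies follow a simple pattern that can be sketched in code. The block below is a deliberately simplified illustration of GRADE-style rating, not the procedure the authors applied: the function name `grade_quality` and its parameters are hypothetical, and it assumes that observational evidence starts at "low", that each serious concern lowers the rating by one level, and that an observed gradient raises the rating only when no concern was recorded. The full framework also allows two-level downgrades and other upgrade criteria, and it relies on reviewer judgement.

```python
# Simplified sketch of GRADE-style quality rating (assumptions noted above).
LEVELS = ["very low", "low", "moderate", "high"]


def grade_quality(randomized, serious_concerns, has_gradient=False):
    """Return a quality label for one body of evidence.

    randomized: True for randomized designs (start at "high"),
        False for observational designs (start at "low").
    serious_concerns: list of domains judged serious, e.g. ["risk of bias"].
    has_gradient: True if dose-response or gradient evidence was observed.
    """
    start = LEVELS.index("high" if randomized else "low")
    # One level down per serious concern, never below "very low".
    level = max(start - len(serious_concerns), 0)
    # Assumption: upgrading is considered only when nothing was downgraded.
    if has_gradient and not serious_concerns:
        level = min(level + 1, len(LEVELS) - 1)
    return LEVELS[level]


# Inputs taken from the ratings described in the text for Table 8.
case_crossover = grade_quality(False, [])
longitudinal = grade_quality(False, ["risk of bias", "indirectness"], has_gradient=True)
case_control = grade_quality(False, ["risk of bias"])

print(case_crossover, "|", longitudinal, "|", case_control)
```

Under these assumptions the sketch reproduces the ratings described above: the case cross-over evidence stays at "low", while the longitudinal and case-control evidence falls to "very low".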

Frequency and duration
The impact of different frequencies or durations of physical activity on health indicators could be compared only across studies (i.e., within-study comparisons were not possible), as most studies dichotomized physical activity frequency and duration or had only a single- or two-arm physical activity intervention.
Only five observational studies examined the association between frequency of physical activity and a health indicator [56,71,120,121,131]. In one study, participants who engaged in TPA <5 times per week were significantly more likely to have a motor difficulty compared to participants who engaged in TPA >5 times per week [56]. It is important to note that this study measured TPA with a questionnaire; therefore, LPA was likely underestimated or not captured [11]. In one bone and skeletal health study, vitamin D levels were significantly lower in children who participated in outdoor physical activity 0, 1-5, 6-10, 11-15, or 16-20 times per month, compared to children who participated 26-31 times per month [120]. However, in a second study, only children with no outdoor physical activity were significantly more likely to have lower vitamin D status compared to children who participated in outdoor physical activity 26-31 times per month [121]. In another study, infants who participated in the prone position <3 times per day were significantly more likely to have plagiocephaly in the unadjusted but not in the adjusted models [131]. Similarly, the proportion of girls or boys participating in active play <7 times per week was not significantly different between non-obese and obese groups in one study [71].
As for duration, 17 observational studies examined the association between duration of physical activity and a health indicator [45,47,53,56,58,63,73,74,78,79,98,99,106,120,129,130,131]. In infants, ≥5 h per day of unrestricted moving time was associated with favourable changes in one measure of adiposity in one study [45]. Additionally, ≥30 min of prone position per day in one study [98] and ≥60 min of prone position per day in another study [99] were associated with more favourable motor development scores or an increased likelihood of achieving motor development milestones at an earlier age. While ≥5 min of prone position per day was favourably associated with plagiocephaly in one study [130], >5 min of prone position per day was not associated with plagiocephaly in another study [131]. For toddlers and/or preschoolers, one study found that those who participated in TPA for ≥7 h per week were significantly less likely to be overweight or obese [63], and a second study found that those who participated in TPA for <840 min per week (<14 h per week) were significantly more likely to have a motor difficulty [56]. In one study, unfavourable associations between ≥2 h of exercise per day and psychosocial health were observed in boys but not in girls [106].
Six studies examined the association between duration of outdoor physical activity and a health indicator [53,58,78,79,120,129]; the findings were mixed. Specifically, >30 min of outdoor physical activity per day was favourably associated with adiposity [58]; ≥1 h per day was favourably associated with bone and skeletal health [120]; and ≥28 h per week was unfavourably associated with risks/harm [129]. Null associations were observed between outdoor physical activity and adiposity in three studies [53,78,79], using cut-points of >1 h per day, ≥2 h per day, and ≥7 h per week. Null associations were also observed in the remaining three studies that examined duration of physical activity [47,73,74]. For instance, the following durations were not associated with adiposity: ≥2 h of active play per weekday and ≥4 h per weekend day in one study [73], <60 min of VPA per day outside of kindergarten in one study [74], and ≤51.43 min per day of physical activity at home and ≤34.59 min per day of structured physical activity in one study [47].

Discussion
In this systematic review, evidence from 96 studies and 71,291 unique participants was synthesized to examine the relationships between objectively and subjectively measured physical activity and health indicators in the early years. For experimental studies, physical activity was consistently (>60% of studies) associated with improved motor development, cognitive development, psychosocial health, and cardiometabolic health. For observational studies, physical activity was consistently associated with favourable motor development, fitness, and bone and skeletal health. However, physical activity was not consistently associated with adiposity or risks/harm across study designs, and significant differences between intervention and control groups in BMI were not observed in the meta-analysis. Although some high-quality evidence was included, the vast majority of evidence was of "low" to "very low" quality. A high-level summary is provided in Table 9.
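The ">60% of studies" criterion for consistency can be illustrated with a short tally. The sketch below is an assumption about how such a rule might be operationalized, not the authors' procedure: the function name `consistent_direction` is hypothetical, and the rule is read as "strictly more than 60% of studies report the same direction". The two inputs use counts stated in this review (the six cross-sectional cardiometabolic studies, and the 20 of 24 experimental studies with at least one favourable association); the label "other" for the remaining four experimental studies is a placeholder, since their findings are not characterized here.

```python
from collections import Counter

# ">60% of studies", as stated in the review; strict inequality is an assumption.
CONSISTENCY_THRESHOLD = 0.60


def consistent_direction(findings):
    """Return the direction reported by more than 60% of studies, else None."""
    if not findings:
        return None
    direction, count = Counter(findings).most_common(1)[0]
    share = count / len(findings)
    return direction if share > CONSISTENCY_THRESHOLD else None


# Six cross-sectional cardiometabolic studies: 1 favourable, 1 unfavourable,
# 3 null, 1 mixed.
cross_sectional_cardiometabolic = (
    ["favourable"] * 1 + ["unfavourable"] * 1 + ["null"] * 3 + ["mixed"] * 1
)

# Experimental studies: 20 of 24 reported a favourable association with at
# least one health indicator; "other" is a placeholder label.
experimental_any_indicator = ["favourable"] * 20 + ["other"] * 4

print(consistent_direction(cross_sectional_cardiometabolic))
print(consistent_direction(experimental_any_indicator))
```

With these inputs no direction reaches the threshold for the cross-sectional cardiometabolic evidence (the largest share is 3/6 null), whereas the experimental evidence clears it (20/24 favourable).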
Where possible, evidence on the association between the dose (i.e., frequency, intensity, duration, and type) of physical activity and health indicators was also synthesized. Various frequencies of physical activity were associated with health indicators, but the most favourable frequency of physical activity to obtain health benefits was unclear. In regard to intensity of physical activity, LPA and MPA were not consistently associated with any health indicators; whereas TPA, MVPA, and VPA were consistently associated with multiple health indicators. In terms of duration of physical activity, the evidence indicated that for infants, ≥30 min per day in the prone position accumulated throughout waking hours was associated with health benefits. However, for toddlers and preschoolers, the duration of physical activity needed to obtain health benefits was unclear. In regard to type of physical activity, consistent favourable associations with at least one health indicator were observed across multiple studies for a variety of different types of physical activity, including active play, aerobic, dance, prone position (infants), and structured/organized. Finally, some evidence existed to indicate that more physical activity was associated with greater health benefits.
This review builds on a previous systematic review conducted in 2012 that synthesized the evidence from 22 studies on the association between physical activity and health indicators among infants, toddlers, and preschoolers [7]. In contrast to the previous review (where cross-sectional study designs were excluded a priori), the present review included all study designs, thereby greatly increasing the evidence base; specifically, 55 cross-sectional studies were included. Not surprisingly, the evidence base also increased with time: a total of 45 of the studies included in the present review were published in 2012 or later (27 of those were cross-sectional). Due to a more comprehensive search strategy compared to the previous review, 17 additional studies published in 2011 or earlier were also included. However, 12 studies included in the previous review were excluded in the present review because of changes in inclusion criteria (e.g., sample size) [132-143]. Despite the differences in the studies considered, the present review yielded results similar to those of the previous review [7]: both found that physical activity was favourably associated with motor development, cognitive development, psychosocial health, bone and skeletal health, and cardiometabolic health. This is in line with other reviews that have been published since 2012 on physical activity and single health indicators, including cognitive development in early childhood (aged 0-6 years) [19], psychosocial health in early childhood [21], and motor development in preschoolers [22]. However, it was acknowledged in both the psychosocial health and cognitive development reviews that the evidence was limited [19,21].
In contrast to the review published by Timmons and colleagues in 2012 [7], favourable associations were not consistently observed between physical activity and adiposity in the present review. Although some favourable associations were observed, a large number of null associations were also observed, as well as some unfavourable and mixed associations (Table 9). Furthermore, no significant differences in BMI between intervention and control groups were observed in the meta-analysis of four studies. It is important to note that the bulk of studies used surrogate adiposity measures, such as BMI, whereas bioelectrical impedance [40], air-displacement plethysmography via the pediatric option for the BodPod [55], or dual energy X-ray absorptiometry [46,49,89] were used to measure adiposity in only five out of the 57 studies. Furthermore, the use of subjective physical activity measures that had unknown psychometric properties and neglecting to account for potential confounders (e.g., diet) within analyses were commonly identified risks of bias that may have affected the findings (Table 1). Alternatively, it could be that physical activity is not strongly associated with adiposity in the early years, and other factors such as diet and sleep are more important predictors in this age group [144,145]. This conclusion is partly supported by the clearer evidence for the impact of physical activity on more rapidly developing health indicators such as motor development, psychosocial health, and cognitive development in the present review, where similar risk of bias limitations existed.
To better understand the commonly examined relationship between physical activity and adiposity in the early years, there is no need for more low-quality evidence; rather, what is needed is higher-quality evidence from strong study designs that address current limitations, including the use of objective measures of physical activity, direct measures of adiposity, and study designs or analyses that account for potential confounding factors. Given that adiposity was by far the most commonly studied health indicator, the focus of future high-quality physical activity and adiposity research should be balanced with the need for high-quality research that includes other health indicators in this age group.
Given that the current review included substantially more evidence than previous reviews, sub-analyses on the dose of physical activity, including frequency, intensity, duration, and type were possible. Intensity of physical activity was commonly examined in observational studies and in three experimental studies [111,112,115]. Previous research has shown that most of the physical activity that preschool-aged children participate in is of light intensity [11,16,17]. For instance, it was reported in a systematic review examining objectively measured physical activity and sedentary time that preschoolers spent an average of 2.2 h per day in LPA compared to 47 min per day in MVPA [11]. Interestingly, in the present review, the higher intensities of physical activity (MVPA and VPA), but not the lower intensities of physical activity (MPA and LPA), were consistently associated with multiple health indicators. However, it is important to note that most TPA consists of LPA, and several favourable relationships between TPA and health indicators were observed. Moreover, the majority of this evidence was in preschool-age samples. Overall, these findings suggest that some developmentally appropriate MVPA may be needed for health benefits at least for preschool-aged children, while acknowledging the inherent limitations of accelerometer cut-points to distinguish different intensities of physical activity [146]. In terms of LPA, there is some evidence in youth [147] and in adults [148] that physical activity at the higher end of the LPA spectrum compared to the lower end of the spectrum may be more beneficial for health but this is masked when looking only at total LPA. Future research should examine whether this is also the case for the early years, including infants, toddlers, and preschoolers. 
Such knowledge will help to determine whether activities at the upper end of the LPA spectrum should be targeted and promoted over lower intensities for health benefits in the early years.
Across both observational and experimental studies in the present review, a wide variety of types of physical activity were examined. The finding that structured/organized physical activity was favourably associated with health indicators in the present review is consistent with a recent systematic review on organized physical activity and health in preschool children [23]. In contrast to another recent systematic review, which reported favourable associations between risky outdoor play (i.e., play where children can disappear/get lost, great heights, rough-and-tumble play) and health [20], primarily null associations were observed between rough-and-tumble play or outdoor play and health indicators in the present study. However, it is important to note that the age groups differed between the two reviews, with the risky outdoor play review including children aged 3-12 years. Furthermore, the outdoor physical activity studies in the present review were not focused on "risky" outdoor play per se. Nevertheless, the favourable associations between a number of different types of physical activity and health indicators suggest that children in the early years should participate in a variety of physical activities for the most health benefits.
It was difficult to draw conclusions about the specific frequency or duration of physical activity that is needed for health benefits because only a small proportion of the included studies examined these dose parameters. Furthermore, most observational studies dichotomized physical activity frequency or duration, and no clear pattern was observed across studies for toddlers and preschoolers. For experimental studies, most involved a single- or two-arm intervention. Additionally, it was not possible to quantify total daily frequency or duration of physical activity because physical activity outside of the intervention was not usually taken into account or even measured. Despite these limitations, sparse but consistent evidence in infants indicates that at least 30 min of prone position or tummy time per day accumulated during waking hours appears beneficial, in particular for motor development. This aligns with the recommendation from the Canadian Paediatric Society, which suggests at least 10-15 min of tummy time, three times per day [149]. Unfortunately, it is not possible to draw specific conclusions on frequency and duration of physical activity for health benefits in toddlers and preschoolers. Current physical activity guidelines in Canada, Australia, and the United Kingdom recommend accumulating 180 min per day of any intensity in these age groups. According to the previously mentioned review by Hnatiuk and colleagues [11], this daily recommendation does align with average prevalence estimates in preschool-aged children (2.2 h or 132 min of LPA + 47 min of MVPA).
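The arithmetic behind the two comparisons in the preceding paragraph can be checked directly. The sketch below uses only figures quoted in the text; the variable names are illustrative, and treating "aligns" as a plain numerical comparison is an assumption.

```python
# Figures quoted in the text.
LPA_HOURS_PER_DAY = 2.2        # average LPA in preschoolers [11]
MVPA_MIN_PER_DAY = 47          # average MVPA in preschoolers [11]
GUIDELINE_MIN_PER_DAY = 180    # guideline target, any intensity

# Convert hours to minutes; round() removes floating-point noise.
lpa_min = round(LPA_HOURS_PER_DAY * 60)
total_min = lpa_min + MVPA_MIN_PER_DAY
shortfall_min = GUIDELINE_MIN_PER_DAY - total_min

# Canadian Paediatric Society suggestion: 10-15 min of tummy time,
# three times per day [149].
tummy_low_min = 10 * 3
tummy_high_min = 15 * 3
meets_30_min = tummy_low_min >= 30

print(f"LPA {lpa_min} min + MVPA {MVPA_MIN_PER_DAY} min = {total_min} min/day")
print(f"Difference from the 180 min guideline: {shortfall_min} min")
print(f"Tummy time range: {tummy_low_min}-{tummy_high_min} min/day")
```

The average estimate totals 179 min per day, 1 min below the 180 min guideline, and the suggested tummy time totals 30-45 min per day, which meets the ≥30 min per day duration identified for infants.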
Despite not being able to make specific conclusions regarding frequency and duration of physical activity needed for health benefits in toddlers and preschoolers, some evidence existed across age groups that more physical activity is better for health. Specifically, this was supported in 13 studies by dose-response evidence or an exposure/outcome gradient for cross-sectional studies, primarily through continuous data [44,49,55,57,58,67,88,89,100,104,119,123,129]. Furthermore, 20 out of the 24 included experimental studies observed favourable associations with at least one health indicator. Although some behaviour compensation could have occurred, it is likely that the majority of the physical activity accumulated as part of the intervention was in addition to children's baseline physical activity. Therefore, this experimental evidence also supports the overall conclusion that more physical activity is better for health. However, to understand the specific frequency and duration of physical activity needed for health benefits across the early years, further dose-response evidence is needed. Specifically, this should include observational studies that compare multiple categories of physical activity frequency and duration in relation to health indicators, and experimental studies that compare multiple intervention arms with different frequency and duration of physical activity in relation to health indicators. Experimental studies should also take into account baseline physical activity levels.
This discussion has already highlighted a number of research gaps and limitations that need to be addressed by future research; however, there are additional gaps and limitations that also warrant attention. For instance, most of the evidence included in this review was based on preschool-aged children. Given the vast developmental differences in early years age groups [6], the findings observed in preschool-aged children may not be generalizable to infants and toddlers. Therefore, future research should examine the relationships between physical activity and health indicators specifically in infants and toddlers to ensure developmentally appropriate doses of physical activity are being identified, recommended, and promoted. Part of this work may involve determining how to most accurately measure physical activity in infants and toddlers. In fact, objective measures of physical activity were used in only two studies with samples classified as infants or toddlers [86,118], although the measurement of physical activity was a limitation observed across all age groups in the present review. Specifically, as noted in the risk of bias assessments, subjective measures of physical activity with unknown psychometric properties were commonly used. It is known that the sporadic and intermittent nature of physical activity in the early years makes it difficult to accurately capture physical activity with subjective measures [146]. Furthermore, although objective measures of physical activity were used in 35 studies, primarily with accelerometers, heterogeneity in data collection (e.g., monitor placement, epoch length) and reduction (non-wear time definitions, removal of naps, cut-points) procedures across studies may have contributed to the inconsistency of some findings. This may explain why similar conclusions were found across health indicators in the present review when comparing studies that used objective versus subjective measures of TPA as the exposure. 
Therefore, identifying the most appropriate accelerometer data collection and reduction procedures for early years children should be explored in future research so that these procedures can be standardized across studies. Furthermore, among the 24 experimental studies, 15 did not measure physical activity, so it was unclear if the intervention was in fact successful in changing physical activity levels. Therefore, baseline and follow-up measures of physical activity should be included in future interventions.
Along with the evidence gaps and limitations associated with age groups studied and physical activity measurement, limited studies were available for a number of the health indicators. For example, there were 10 or fewer included studies for each of the following health indicators: psychosocial health, fitness, bone and skeletal health, cardiometabolic health, and risks/harm. Future high-quality research that increases the evidence base for these health indicators is needed. Additionally, while only three studies were included for fitness, some overlap existed between fitness and motor development categories (e.g., standing long jump versus standing broad jump; 12-m run versus 20-m shuttle run). Consensus is needed on what measures constitute fitness versus motor development in this age group.
One strength of the present systematic review was the use of a comprehensive search strategy that was both developed and peer-reviewed by librarians with expertise in systematic reviews. Another strength was the broad scope of the review through the inclusion of all study designs, both subjective and objective measures of physical activity, multiple health indicators, and multiple age groups (i.e. infants, toddlers, and preschoolers). Furthermore, the conduct of sub-analyses on dose of physical activity was a notable strength of the review, as was the meta-analysis of four adiposity interventions. Finally, the use of the established GRADE framework to guide the review and assess the quality of evidence was an additional strength [28].
The present review also had several limitations, including English and French language limits for feasibility, as well as sample size restrictions for both feasibility and generalizability. It is possible that studies published in other languages or with smaller sample sizes might have provided additional insight, especially for health indicators where evidence was limited. Furthermore, while a meta-analysis was conducted on four included studies, due to the large heterogeneity of the study designs and measured outcomes, the majority of findings were based on a narrative synthesis that weighted all studies equally. For some health indicators, conclusions from the narrative synthesis had to be drawn from a small number of studies. Furthermore, it was not possible to do sensitivity analyses between higher- and lower-quality evidence because the vast majority of evidence was "low" to "very low" quality.

Conclusions
This review synthesized evidence from 96 studies on the health implications of physical activity in the early years. Physical activity was consistently found to be favourably associated with a broad range of health indicators. Several types of physical activity, especially prone position for infants, TPA, and physical activity of at least moderate to vigorous intensity, particularly for preschool-aged children, were consistently found to be favourably associated with a number of health indicators. Although it was not possible to identify the specific frequency and duration of physical activity needed for health benefits in all age groups, it was consistently observed that more physical activity (in terms of frequency or duration) was better for health. Therefore, it can be concluded that it is important to promote physical activity in the early years. The findings of this review will help to inform evidence-based guidelines to facilitate physical activity promotion aimed at optimizing the overall health of our youngest children. Given that the study of physical activity in the early years is still a relatively new area of inquiry, future research should focus on addressing a number of gaps and limitations mentioned in this review, in order to strengthen the evidence base and accurately inform future health promotion efforts.

Additional files
Additional file 1: Search strategies for the systematic review. (DOCX 37 kb)
Additional file 2: Supplementary Tables S1-S8. Summary of studies included in the systematic review for each health indicator sorted by (whenever possible) study design, age group, and physical activity measurement. (DOCX 177 kb)

Abbreviations
BMI: Body mass index; GRADE: Grading of recommendations assessment, development, and evaluation; LPA: Light-intensity physical activity; MPA: Moderate-intensity physical activity; MVPA: Moderate- to vigorous-intensity physical activity; PICO: Population, intervention, comparison, and outcome; PRISMA: Preferred reporting items for systematic reviews and meta-analyses; PROSPERO: International prospective register of systematic reviews; RCT: Randomized controlled trial; TPA: Total physical activity; VPA: Vigorous-intensity physical activity