
Pregnancy health in a multi-state U.S. population of systemically underserved patients and their children: PROMISE cohort design and baseline characteristics



Gestational weight gain (GWG) is a routinely monitored aspect of pregnancy health, yet critical gaps remain about optimal GWG in pregnant people from socially marginalized groups, or with pre-pregnancy body mass index (BMI) in the lower or upper extremes. The PROMISE study aims to determine overall and trimester-specific GWG associated with the lowest risk of adverse birth outcomes and detrimental infant and child growth in these underrepresented subgroups. This paper presents methods used to construct the PROMISE cohort using electronic health record data from a network of community-based healthcare organizations and characterize the cohort with respect to baseline characteristics, longitudinal data availability, and GWG.


We developed an algorithm to identify and date pregnancies based on outpatient clinical data for patients 15 years or older. The cohort included pregnancies delivered in 2005–2020 with gestational age between 20 weeks, 0 days and 42 weeks, 6 days; and with known height and adequate weight measures needed to examine GWG patterns. We linked offspring data from birth records and clinical records. We defined study variables with attention to timing relative to pregnancy and clinical data collection processes. Descriptive analyses characterize the sociodemographic, baseline, and longitudinal data characteristics of the cohort, overall and within BMI categories.


The cohort includes 77,599 pregnancies: 53% had incomes below the federal poverty level, 82% had public insurance, and the largest race and ethnicity groups were Hispanic (56%), non-Hispanic White (23%) and non-Hispanic Black (12%). Pre-pregnancy BMI groups included 2% underweight, 34% normal weight, 31% overweight, and 19%, 8%, and 5% Class I, II, and III obesity. Longitudinal data enable the calculation of trimester-specific GWG; e.g., a median of 2, 4, and 6 valid weight measures were available in the first, second, and third trimesters, respectively. Weekly rate of GWG was 0.00, 0.46, and 0.51 kg per week in the first, second, and third trimesters; differences in GWG between BMI groups were greatest in the second trimester.


The PROMISE cohort enables characterization of GWG patterns and estimation of effects on child growth in underrepresented subgroups, ultimately improving the representativeness of GWG evidence and corresponding guidelines.



Pregnancy is a critical period for the health of birthing parents and their children. Gestational weight gain (GWG) is an easily and routinely monitored aspect of pregnancy health, with higher and lower levels associated with greater risk of adverse pregnancy and birth outcomes [1], maternal postpartum and long-term chronic conditions [2], and offspring health [3]. Accordingly, GWG guidelines from the Institute of Medicine (IOM, now the National Academy of Medicine) [4, 5] draw from extensive evidence and seek to promote healthy GWG, but were last updated in 2009. Critical evidence gaps remain for future revisions of the guidelines, particularly pertaining to pregnant people who have fewer resources, belong to marginalized racial and ethnic groups, or have pre-pregnancy body mass index (BMI) in the lower or upper extremes (underweight, class II or III obesity) [6]. Additionally, evidence of the effects of timing and magnitude of GWG on longer-term child outcomes is relatively scant.

These gaps are difficult to fill with traditional study design: prospective cohorts with longitudinal pregnancy measures tend to underrepresent people with low incomes or other socially or economically marginalized groups, and few data sources provide child outcomes beyond birth. Electronic Health Records (EHR) are a valuable source of repeated weight measures and outcomes in large study populations [7, 8]. Inclusion of large numbers of pregnancies enables investigation of GWG patterns and their associations with health outcomes in subpopulations that are typically understudied due to inadequate numbers of individuals in each subgroup. Further, while many prior pregnancy studies that utilize EHR-derived data include predominately commercially insured patients [9,10,11], recently developed networks of community-based healthcare organizations (CHCOs) provide rich, longitudinal, clinical data on predominately publicly-insured or uninsured pregnant patients [12]. Yet EHR data pose methodological challenges due to the complexity of clinical data which are not designed for research purposes, particularly in non-integrated care settings.

The PReventing Obesity through healthy Maternal gestational weight gain In the Safety nEt (PROMISE) Study aims to determine overall and trimester-specific GWG associated with the lowest risk of adverse birth outcomes and detrimental infant and child growth in a multi-state U.S. population of CHCO patients. The objectives of this paper are to (1) present the methods and rationale used to (a) construct the PROMISE cohort and (b) develop theory- and data-driven variable definitions with attention to timing relative to GWG and to clinical data collection processes and (2) characterize the cohort with respect to baseline characteristics, longitudinal data availability, and GWG. Throughout this paper, we use neutral weight-related terminology (e.g., high BMI, ≥35 kg/m2) where possible, but also recognize the clinical relevance of “obesity classes” and the ongoing ambiguity about the preferred terminology for reducing weight stigma [13]. We recognize that while most pregnant people identify as women, pregnant people can be of any gender, and some are not yet adults. We primarily use the terms “birthing parent” or “pregnant person”, but also use “maternal” to differentiate characteristics of the pregnant person from their offspring.


ADVANCE Clinical Research Network

The PROMISE Study cohort is derived from OCHIN (not an acronym) data from the Accelerating Data Value Across a National Community Health Center Network (ADVANCE) Clinical Research Network [12]. OCHIN is a nonprofit leader in equitable health care innovation and a trusted partner to a growing national provider network. With a centralized EHR system and the largest collection of community health data in the country, OCHIN conducts practice-based research in clinics located in more than 20 states, building patient and provider engagement in research design and implementation at the grassroots. EHR data contain information from the OCHIN Epic® practice management system (e.g., billing and appointments) as well as demographic, utilization, and clinical data from the full OCHIN EHR, much of which has been standardized for research in ADVANCE. The PROMISE study was approved by the Institutional Review Board at Oregon Health & Science University, the lead site for the study.

Derivation of the PROMISE cohort

In order to study GWG in this unique and understudied patient population, we built upon well-established research using EHR-derived pregnancy cohorts within integrated inpatient and outpatient settings [11, 14, 15]. Because OCHIN data were not linked to inpatient data sources during the study period, our pregnancy algorithm used the extensive outpatient visit data available in OCHIN’s EHR to estimate pregnancy dating needed to define the pregnancy period. In addition, our algorithm defined a cohort for which OCHIN data contain measures, procedures, and diagnoses throughout the study period of interest, enabling longitudinal follow-up of persons regardless of their health insurance status (including lack of insurance or changes in insurance status, common for patients receiving care at CHCOs).

Identification of pregnancies

Pregnancies were identified using a process detailed in Supplementary Material 1: Appendix A and summarized here. Two major data sources in the EHR were used: Pregnancy Episodes of Care (PE) and Encounter-based Pregnancy Records (EPR). An episode of care in the OCHIN EHR is initiated by a provider as a means of providing aggregated information from multiple encounters and data fields for a given clinical condition. A pregnancy-specific episode of care is typically initiated by a medical provider or their support staff at the onset of an individual’s prenatal care. PEs include variables indicating date of last menstrual period (LMP), estimated delivery date (EDD), pregnancy outcome, gestational age (GA) at delivery (when known), and other pregnancy-level information. PEs were used as our primary source of information. The PROMISE team also identified EPRs: pregnancies not associated with a PE. EPRs were defined based on International Classification of Diseases diagnosis codes (ICD-9 and ICD-10) and Current Procedural Terminology (CPT) procedure codes that indicate that the patient was pregnant at the time of an encounter. We used codes that were identified and classified by the Kaiser Permanente (KP) Center for Effectiveness and Safety Research for key attributes including pregnancy outcome, GA, and fetal count. The algorithm was based on work from Hornbrook et al. [16], updated in the KP virtual data warehouse [17] and adapted for OCHIN data by the PROMISE team.

Briefly, in Step 1, a preliminary set of PEs and EPRs were identified among OCHIN records from 1/1/2004 through 1/4/2021, for patients 15 years or older at the time of encounter or pregnancy start. PEs with identical start and end dates were deduplicated, EPRs were assembled from encounter-level data, preliminary start dates were assigned based on outcome date and type (e.g., miscarriage or live birth), and clinical encounters within the preliminary pregnancy start and end dates were extracted. In Step 2, pregnancy start and end dates were refined. Start dates were calculated by subtracting GA at birth, when available, from delivery date; GA at birth is automatically calculated in the OCHIN EHR when both an EDD and delivery date are available. Otherwise, we used the calculated value EDD – 280 days, the latest prior LMP date, or the last recorded encounter diagnosis indicating weeks of gestation, in that order of preference. End dates were refined using birth dates of children who were linked to the pregnant patient within the OCHIN EHR [18], clinician-entered delivery date, or pre- and post-delivery codes, in that order of preference. In Step 3, overlapping or incomplete pregnancy records were removed or consolidated. Self-reported pregnancies noted in a patient’s medical history contain limited patient-reported information (e.g., only pregnancy dates); these pregnancies were excluded if there was no additional information in the EHR. Data sources of the pregnancy records and pregnancy start date and end dates are tabulated in Supplementary Material 1: Appendix A.
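The order of preference for pregnancy start dates in Step 2 can be illustrated with a minimal sketch. This is not the PROMISE algorithm itself (see Supplementary Material 1: Appendix A for that); the function name, argument names, and the back-calculation from a weeks-of-gestation diagnosis are assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Optional


def estimate_pregnancy_start(
    delivery_date: Optional[date] = None,
    ga_at_birth_days: Optional[int] = None,
    edd: Optional[date] = None,
    lmp: Optional[date] = None,
    dx_encounter_date: Optional[date] = None,
    dx_weeks_gestation: Optional[int] = None,
) -> Optional[date]:
    """Return an estimated pregnancy start date, or None if no source is available.

    Hypothetical helper: all argument names are illustrative, not OCHIN field names.
    """
    # Preferred source: delivery date minus gestational age at birth.
    if delivery_date is not None and ga_at_birth_days is not None:
        return delivery_date - timedelta(days=ga_at_birth_days)
    # Second choice: estimated delivery date minus 280 days.
    if edd is not None:
        return edd - timedelta(days=280)
    # Third choice: the latest prior LMP date.
    if lmp is not None:
        return lmp
    # Last choice: the last encounter diagnosis indicating weeks of gestation.
    # Assumption: the start date is back-calculated from the encounter date.
    if dx_encounter_date is not None and dx_weeks_gestation is not None:
        return dx_encounter_date - timedelta(weeks=dx_weeks_gestation)
    return None
```

For example, an EDD of 8 October 2019 with no recorded GA at birth gives an estimated start date of 1 January 2019.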

Inclusion and exclusion criteria

The PROMISE study included pregnancies delivered between 1/1/2005 and 12/31/2020 among OCHIN health network patients 15 years of age or older at pregnancy start. Pregnancies with GA at delivery between 20 weeks, 0 days and 42 weeks, 6 days were retained; current recommendations [19,20,21] are to induce labor by 41 weeks, though some patients choose to wait until 42 weeks. Therefore, GA longer than 42 weeks, 6 days was considered implausible, likely reflecting inaccurate pregnancy dating, and pregnancies with GA less than 20 weeks were assumed not to be viable. Study inclusion also required availability of ≥ 1 plausible adult height measurement recorded at ≥ 16 years of age and plausible BMI and weight measures required to characterize GWG: ≥ 1 baseline weight measure, ≥ 1 weight measure in the second or third trimester, and ≥ 1 additional weight measure during pregnancy. Plausible height, pre-pregnancy BMI, and weights are defined in Study Variables. Inclusion and exclusion criteria were applied at the pregnancy level, enabling the inclusion of multiple pregnancies per person. We will adjust for within-person correlation in future statistical analyses.

Data linkages

Linkage of parent and child clinical data

At OCHIN, EHR data from parents and children were linked using methods developed and validated as part of a multi-site National Patient-Centered Clinical Research Network (PCORnet) demonstration project [22] and subsequent ADVANCE research [18]. Our linkage methods overcame two data-related challenges in OCHIN data: (a) our data warehouse does not include inpatient data, a common source of maternal-child linkages resulting from hospital-based birth of the child; and (b) as of 2014, Medicaid no longer records household identifiers, which were previously used to link family members in OCHIN clinics located in Oregon [23]. Briefly, explicit, imputed, and fuzzy matches were performed using data available in ADVANCE [18]. Explicit documentation of parent–child relationships included the child ID listed in the parent’s guarantor account information, obstetric claim form, or mother listed in the child’s emergency contact information. Imputed relationships included patient matches on geocoded coordinates for each patient’s last known address, or home phone number. Fuzzy matches compared free-text mother emergency contact demographics against the list of female patients 18 years or older in ADVANCE. Parent–child linkages included 66% with explicit linkage, 34% with imputed linkage, and < 1% with fuzzy linkage at the time of the PROMISE Study.

Birth record linkage

The subset of EHR pregnancies observed in California, Oregon, and Washington are being linked to birth record data using LinkPlus, a linkage program developed by the CDC [24] and LinkPlus-described procedures including data standardization, calculation of linkage score, and clerical review of matches with uncertain linkage scores. Birth record linkage details are presented in Supplementary Material 1: Appendix B. As of February 2024, Oregon and California linkages were complete, with pending data acquisition from the state of Washington. Linkage rates were 88% and 89% for Oregon and California, respectively.

GIS data linkage

ADVANCE patient residential addresses are continuously updated and geocoded, mapped to geographic units (e.g., county, census tract), and linked to US Census and other national data sources [25]. In the ADVANCE population, 75.4% of residential addresses have been geocoded to street address level, 1.6% to street segment, and 23.1% to postal/ZIP code levels.

Study variables

Anthropometry of the birthing person

Maternal weights were extracted from EHR encounters. Plausible weights were defined in two stages. First, weights < 36.3 or > 453.6 kg (< 80 or > 1000 pounds) were discarded (n = 323, < 0.01%). Second, we identified pregnancy-specific outliers based on deviation of each measure from temporally adjacent weight measures for the same pregnancy, within each trimester; details are described in Supplementary Material 1: Appendix C. This longitudinal algorithm was adapted from prior work conducted by Sharma and colleagues using pregnancy-related data from the EHR of the Kaiser Permanente Northwest Health Care System [11]. Plausible weights were then used to define baseline and pregnancy weights.

Baseline weight

In order to minimize the number of pregnancies excluded due to lack of available baseline weight, we selected the weight closest to pregnancy start date, within 365 days prior to and 97 days after pregnancy start date. Among included pregnancies, the mean duration between the selected weight and pregnancy start date was 7.80 weeks (SD 6.26 weeks); 66,874 (96%) of baseline pregnancy weight measures were taken within 97 days (< 14 weeks) from pregnancy start, with the majority of these measures (83%) from after pregnancy start. Among 19,006 pregnancies with baseline weight measures from the first 97 days of pregnancy that also had pre-pregnancy weight measurements up to 97 days before the pregnancy, the median absolute difference in pregnancy-specific weights was 1.4 kg (25th, 75th percentile: 0.5, 2.5).
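The baseline weight rule (closest measure to pregnancy start, within 365 days before and 97 days after) can be sketched as follows. The function name, the input layout, and the tie-breaking rule (the earlier measure wins when two are equally close) are assumptions for illustration, not details reported in the paper.

```python
from datetime import date
from typing import List, Optional, Tuple


def select_baseline_weight(
    pregnancy_start: date,
    weights: List[Tuple[date, float]],
    days_before: int = 365,
    days_after: int = 97,
) -> Optional[Tuple[date, float]]:
    """Pick the plausible weight (date, kg) measured closest to pregnancy start.

    Only measures from `days_before` days before through `days_after` days
    after pregnancy start are eligible. Returns None if none qualify.
    """
    candidates = []
    for measured_on, kg in weights:
        offset = (measured_on - pregnancy_start).days
        if -days_before <= offset <= days_after:
            # Sort key: absolute distance first, then date (assumed tie-break).
            candidates.append((abs(offset), measured_on, kg))
    if not candidates:
        return None
    _, measured_on, kg = min(candidates)
    return measured_on, kg
```

In this sketch, a measure taken 45 days after pregnancy start is preferred over one taken 214 days before it, and a measure taken 151 days after is ineligible.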

When encounter-level baseline weight was unavailable, we used patient-reported pregravid weight (n = 7,869, 11%), which was contained in a data field specific for this information in the prenatal vitals section of the EHR. In 38,228 pregnancies in which both sources were available, correlation between encounter-level baseline weight and pregravid weight was 0.99.


Height

Height was defined as the median height among all height measures in the patient’s chart since their 16th birthday. Plausible heights were defined to align with prior literature (48–84 inches [26] [121.9–213.4 cm]).

Baseline Body Mass Index (BMI)

Height and baseline weight were used to calculate baseline BMI, used as an approximation of pre-pregnancy BMI. Plausible BMI was defined based on prior studies (12–100 kg/m2) [27]. BMI was analyzed as both a continuous and categorical variable [28]: Underweight (< 18.5 kg/m2), Normal (18.5 to < 25 kg/m2), Overweight (25 to < 30 kg/m2), Obesity Class I (30 to < 35 kg/m2), Obesity Class II (35 to < 40 kg/m2), and Obesity Class III (≥ 40 kg/m2).
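As a worked illustration of the BMI definitions above, the following sketch computes BMI from weight and height and assigns the categories listed in the text. Function names are illustrative; returning None for BMI outside the plausible range of 12–100 kg/m2 is one way to represent an implausible value and is an assumption of this sketch.

```python
from typing import Optional


def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m2 from weight in kg and height in cm."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m * height_m)


def bmi_category(bmi_value: float) -> Optional[str]:
    """Map BMI to the categories used in the text; None if implausible."""
    if not 12.0 <= bmi_value <= 100.0:
        return None  # outside the plausible range of 12-100 kg/m2
    if bmi_value < 18.5:
        return "Underweight"
    if bmi_value < 25.0:
        return "Normal"
    if bmi_value < 30.0:
        return "Overweight"
    if bmi_value < 35.0:
        return "Obesity Class I"
    if bmi_value < 40.0:
        return "Obesity Class II"
    return "Obesity Class III"
```

For example, a baseline weight of 70 kg and a height of 170 cm give a BMI of about 24.2 kg/m2, which falls in the Normal category.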

Pregnancy weights

Weights recorded during the pregnancy episode were classified into first (0 to < 14 weeks), second (14 to < 28 weeks), or third (28 weeks-end of pregnancy) trimester.


Total GWG was calculated as last pre-delivery weight minus pre-pregnancy weight, limited to term pregnancies for which the last pre-delivery weight was within 2 weeks of the delivery date. Adherence to IOM guidelines for total GWG was determined based on total GWG and pre-pregnancy BMI category (below, within, or above BMI-specific guidelines), among term pregnancies [5]. Trimester-specific weight gain (total kg, and kg per week) was calculated by fitting a simple linear regression model to measured weights within each trimester for each pregnancy, as described by Abrams & Selvin [29] and applied by others [30], among pregnancies with two or more observed weights, at least one week apart, in the given trimester. The time coefficients indicate the rate of weight gain per week for each trimester within each pregnancy.
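The trimester-specific rate of gain can be sketched as an ordinary least squares slope of weight on gestational week, fitted separately within each trimester of a single pregnancy. The sketch below is illustrative only: function names are hypothetical, and the eligibility check reads "at least one week apart" as a span of at least one week between the earliest and latest measure, which is an assumption.

```python
from typing import List, Optional, Tuple


def trimester(ga_weeks: float) -> int:
    """Trimester for a gestational age in weeks (0 to <14, 14 to <28, 28+)."""
    if ga_weeks < 14:
        return 1
    if ga_weeks < 28:
        return 2
    return 3


def weekly_gain(points: List[Tuple[float, float]]) -> Optional[float]:
    """Least squares slope (kg per week) for (ga_weeks, weight_kg) pairs.

    `points` should hold the measures from one trimester of one pregnancy.
    Returns None when fewer than two measures are available or when the
    measures span less than one week (assumed reading of the criterion).
    """
    if len(points) < 2:
        return None
    weeks = [w for w, _ in points]
    if max(weeks) - min(weeks) < 1:
        return None
    n = len(points)
    mean_week = sum(weeks) / n
    mean_kg = sum(kg for _, kg in points) / n
    sxx = sum((w - mean_week) ** 2 for w in weeks)
    sxy = sum((w - mean_week) * (kg - mean_kg) for w, kg in points)
    return sxy / sxx
```

With weights of 60, 62, and 64 kg at 14, 18, and 22 weeks, the fitted slope is 0.5 kg per week for the second trimester.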

Pregnancy and child outcomes

Gestational age at delivery (GA)

GA at delivery was calculated within the OCHIN EHR for pregnancies with recorded EDD and delivery date (74%); otherwise, we calculated GA at delivery from start and end dates, defined above (Identification of pregnancies). GA is categorized as preterm birth (GA < 37 completed weeks) and term/postterm (GA ≥ 37 completed weeks). Spontaneous and medically indicated preterm birth are secondary outcomes.

Infant size at birth

We extracted birth weight from the PE where available; birth records provide a secondary source within the subset with linked birth records (Oregon, California, Washington deliveries). We defined implausible birth weights based on gestational age- and sex-specific z-scores originally described by Alexander et al. [31] and applied by others [32, 33]: for GA ≥ 37 weeks, < -5 or > 5 SD, and for GA < 37 weeks, < -4 or > 3 SD, calculated within the PROMISE cohort. Size for GA was calculated as an indicator of fetal growth using the reference curve published by Talge et al. [32], enabling calculation of both categorical and continuous measures of birth size, and incorporating clinical estimates of GA, which are more accurate than LMP and readily available in the EHR. Size for GA will be examined in primary analysis as small (< 10th percentile; SGA), appropriate, and large (> 90th percentile; LGA) for gestational age [31, 34, 35]; and in secondary analysis, using lower SGA [36] or higher LGA [37] cut points that are more clinically meaningful, and as a semi-continuous variable [38]. For comparison with prior literature, we will examine birth weight as a secondary outcome, classified as very low (< 1.5 kg; VLBW), low (< 2.5 kg; LBW), normal, and high (> 4.5 kg; HBW) birth weight.
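The birth size rules above translate directly into simple threshold checks. The following sketch assumes the z-score and size-for-GA percentile have already been computed (the reference-curve lookup is not shown); function names and the "AGA" label are illustrative.

```python
def is_implausible_birth_weight(z_score: float, ga_weeks: float) -> bool:
    """Flag implausible birth weights from GA- and sex-specific z-scores.

    Term (GA >= 37 weeks): below -5 or above 5 SD.
    Preterm (GA < 37 weeks): below -4 or above 3 SD.
    """
    if ga_weeks >= 37:
        return z_score < -5 or z_score > 5
    return z_score < -4 or z_score > 3


def size_for_ga(percentile: float) -> str:
    """Primary size-for-GA classification from a reference-curve percentile."""
    if percentile < 10:
        return "SGA"
    if percentile > 90:
        return "LGA"
    return "AGA"  # appropriate for gestational age


def birth_weight_class(weight_kg: float) -> str:
    """Secondary outcome: birth weight categories used in the text."""
    if weight_kg < 1.5:
        return "VLBW"
    if weight_kg < 2.5:
        return "LBW"
    if weight_kg > 4.5:
        return "HBW"
    return "normal"
```

Note the asymmetric preterm thresholds: a z-score of 3.5 is flagged as implausible for a birth at 34 weeks but not for a birth at 39 weeks.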

Child anthropometry

We extracted child body weight and length or height, measured on the same day, from clinical encounter records through 1/12/2023. Length (for children < 24 months) and height (for children ≥ 24 months) were recorded in a single field. Longitudinal data availability for child weights and heights for the ADVANCE population has been previously reported [39] and will be presented for the PROMISE cohort in forthcoming outcomes analyses.

We will examine infant growth from birth to 18 months of age, which includes the infant BMI peak (typically at 8–9 months but as late as 17 months) [40, 41]. We will examine growth both in length (cm) and weight (kg), because drivers of weight gain versus length increase differ [42, 43], and explore differences in children born preterm versus full term. For descriptive analysis, we will calculate weight-for-age and length-for-age percentiles and z-scores relative to the World Health Organization (WHO) standard growth curve [44]. We will also explore early life changes in BMI; BMI assesses weight independent of height [45], an indirect measure of adiposity, and it performs better than weight-for-length, even in very young children [46, 47].

To approximate weight status at the critical transition period of school entry, we selected height and weight measured closest to 5 years of age, among measures when children were 4 to < 6 years of age. BMI was calculated and converted to age- and sex-specific BMI percentiles using Centers for Disease Control and Prevention (CDC) 2000 growth curves [48], which are recommended for children 2 years or older [24]. Our primary outcome is continuous BMI z-score, with extended BMI z-score as an alternate outcome that may perform better in children with very high BMI [49]. Our secondary outcome is BMI classification: underweight (< 5th percentile), normal weight (5th to < 85th), overweight (85th to < 95th), obesity (95th to < 20% higher than 95th), and severe obesity (≥ 20% higher than 95th percentile) [50].
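The secondary BMI classification can be sketched as below. The sketch assumes the caller has already looked up the child's age- and sex-specific BMI percentile and the BMI value at the 95th percentile from the CDC 2000 growth curves; that lookup is not implemented here, and the function and argument names are hypothetical.

```python
def classify_child_bmi(bmi_value: float, percentile: float, bmi_at_p95: float) -> str:
    """Classify child weight status from BMI, its percentile, and the 95th percentile BMI.

    Severe obesity is BMI at least 20% higher than the 95th percentile value,
    that is, bmi_value >= 1.2 * bmi_at_p95.
    """
    if bmi_value >= 1.2 * bmi_at_p95:
        return "severe obesity"
    if percentile >= 95:
        return "obesity"
    if percentile >= 85:
        return "overweight"
    if percentile >= 5:
        return "normal weight"
    return "underweight"
```

As a purely illustrative case (the 95th percentile value here is made up, not taken from the CDC tables): if the 95th percentile BMI were 18.0 kg/m2, the severe obesity threshold would be 21.6 kg/m2, so a child with BMI 22.0 kg/m2 would be classified as having severe obesity.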


Covariates

We anchored the identification and definition of covariates to our causal framework (Fig. 1).

Fig. 1 Conceptual framework

Age of the pregnant person at the EDD was calculated by subtracting the date of birth of the birthing person from the EDD. Clinical care processes are based on age at EDD; for example, designation of pregnancies as having “advanced maternal age” when the birthing person is ≥ 35 years of age at expected delivery.

Race and ethnicity are recorded in the patient table in separate fields. In contrast to concerns about missing race and ethnicity information in claims and clinical data [51, 52], CHCOs are federally required to report race and ethnicity, supporting a high degree of completeness in our study population. We combined race and ethnicity into a single variable based on recent guidelines [53], which recognize that race categories typically used in the U.S. are not interpretable for many Hispanic people, resulting in misclassification into white or “other” race or a high level of missingness in the race variable [54]. We created the following race and ethnicity categories, which use the most granular race categories recorded in the EHR: Hispanic, Non-Hispanic [NH] American Indian/Alaska Native, NH Asian, NH Black, NH Native Hawaiian/Other Pacific Islander, NH other/multiple, NH white, unknown. For secondary analyses, more granular race categories will be obtained from birth records among the subset with linked birth record data. We examine race and ethnicity as social variables [53, 55], reflecting a constellation of social, cultural, historical, and interpersonal processes that impact family resources, individual behaviors, experience of psychosocial stress, and biased clinical care delivery that can impact body weight and/or pregnancy and child health.

Preferred spoken language is recorded in the patient demographics table (English, Spanish, other, unknown). We examine preferred language as a proxy for cultural and social factors that influence body weight and pregnancy and child health, as well as an indicator of language barriers that can influence clinical care.

State of residence was obtained from the patient demographics table and reflects the most recently reported state of residence at the time of data extraction. For descriptive purposes, states were classified as Oregon, Washington, California, and other; these categories reflect the predominant states represented in the PROMISE study population. We examine state of residence as a proxy for geographically-patterned determinants of pregnancy and child health, including variations in clinical practice across states.

Education at the time of delivery and parity are available in the birth record, among the subset with linked birth records. Education will be categorized as less than high school, high school, some college, college completion or higher and examined as a dimension of socioeconomic position. Parity is a known determinant of body weight and pregnancy and child health; it will be categorized as nulliparous or multiparous.

Income level, payer type, and smoking status were obtained from encounter-level data and were defined based on several considerations: the time period(s) of interest, degree of missingness in the target time period, and frequency and process of data collection in CHCOs.

Income as a percentage of the federal poverty level (%FPL) is collected by most CHCOs for reimbursement purposes. Household income (USD), state of residence, most recent family size, and year- and region-specific U.S. poverty guidelines are used to calculate %FPL. Given that income destabilizes at around mid-pregnancy in the general population [56], we sought to measure %FPL prior to or early in pregnancy to establish temporal order prior to the exposure of interest. However, defining %FPL within a specific perinatal period is not possible with the existing data collection process: last known %FPL is requested from patients approximately every 6–12 months. Thus, we selected the %FPL value recorded closest to pregnancy start, within 365 days prior to pregnancy through the end of pregnancy. This time frame reduced the number of pregnancies with missing data while, based on preliminary analysis, approximating FPL early in pregnancy. For the purposes of this paper, we classified FPL as ≤ 100%, 101–200%, > 200%, or unknown to characterize the income levels in this cohort.
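The selection and grouping of %FPL described above can be sketched as follows. The sketch takes already-computed %FPL values as input (the calculation from household income, family size, and poverty guidelines is not shown). Function names, the input layout, and the handling of values between 100% and 101% (grouped with 101–200%) are assumptions for illustration.

```python
from datetime import date, timedelta
from typing import List, Optional, Tuple


def select_fpl(
    pregnancy_start: date,
    pregnancy_end: date,
    records: List[Tuple[date, float]],
    lookback_days: int = 365,
) -> Optional[float]:
    """Return the %FPL recorded closest to pregnancy start.

    Eligible records fall between `lookback_days` days before pregnancy
    start and the end of pregnancy. Returns None if none qualify.
    """
    window_open = pregnancy_start - timedelta(days=lookback_days)
    eligible = [
        (abs((recorded_on - pregnancy_start).days), recorded_on, pct)
        for recorded_on, pct in records
        if window_open <= recorded_on <= pregnancy_end
    ]
    if not eligible:
        return None
    # Closest record wins; ties go to the earlier date (assumed tie-break).
    return min(eligible)[2]


def fpl_category(pct: Optional[float]) -> str:
    """Group %FPL into the descriptive categories used in the text."""
    if pct is None:
        return "unknown"
    if pct <= 100:
        return "<=100%"
    if pct <= 200:
        return "101-200%"
    return ">200%"
```

A pregnancy with no %FPL value in the window is assigned to the unknown category rather than dropped, consistent with the classification described above.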

Payer type is recorded at each encounter. We classified payer type as public (Medicaid and Medicare), private, other non-comprehensive insurance (e.g., worker’s compensation, auto, life, farmer’s insurance, private plans specific to dental/vision care, and grant/pilot study coverage), or uninsured. Our goal was to capture insurance and payer status most likely to impact pregnancy health, while recognizing the temporal patterns in insurance throughout pregnancy and that intermittently recorded payer changes are unlikely to reflect actual changes in payer type. Therefore, we selected the predominant payer type throughout the second and third trimesters, defined as the payer type reported at the greatest number of visits. If there was an equal number of visits with multiple payer types, the following hierarchy was applied: Medicaid, Medicare, other public, private. This approach was informed by preliminary analysis showing that payer type changes in early pregnancy, largely due to Medicaid eligibility expansion during pregnancy [57]. Further, we do not expect that first trimester weight gain would influence changes in insurance; that is, second and third trimester payer type is unlikely to be a mediator of the association between GWG and child outcomes.
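A minimal sketch of the predominant payer rule follows: the payer recorded at the greatest number of second and third trimester visits is selected, and ties are resolved with the stated hierarchy. The payer labels, function name, and the alphabetical fallback for tied payers that are not in the hierarchy are assumptions of this sketch.

```python
from collections import Counter
from typing import List, Optional

# Tie-break order stated in the text.
TIE_BREAK_ORDER = ["Medicaid", "Medicare", "other public", "private"]


def predominant_payer(visit_payers: List[str]) -> Optional[str]:
    """Return the payer recorded at the most visits, or None if no visits.

    `visit_payers` holds one payer label per second or third trimester visit.
    """
    if not visit_payers:
        return None
    counts = Counter(visit_payers)
    top_count = max(counts.values())
    tied = [payer for payer, n in counts.items() if n == top_count]
    if len(tied) == 1:
        return tied[0]
    for payer in TIE_BREAK_ORDER:
        if payer in tied:
            return payer
    # Assumption: payers outside the hierarchy are resolved alphabetically.
    return sorted(tied)[0]
```

For example, one Medicaid visit and one private visit resolve to Medicaid, while two private visits and one Medicaid visit resolve to private.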

Tobacco use is collected at each encounter as required by EHR Meaningful Use [58]; here, we report use of any tobacco product (smoking or smokeless tobacco). We defined tobacco use within two time periods: pre-pregnancy to early pregnancy (365 days prior to pregnancy start through 12 weeks of gestation) and mid-late pregnancy (13 weeks of gestation through the end of pregnancy). This choice reflects well-known changes in tobacco use during pregnancy, potential time-specific effects of tobacco use on weight gain, and preliminary analysis indicating that tobacco status is updated frequently and that changes from current to former user are maintained for the remainder of the pregnancy. In each time period, tobacco use was classified as current (current user at ≥ 1 encounter during the time period), former (no reports of current use; former user at ≥ 1 encounter during the time period), never (no reports of current or former use; never or passive/environmental use at ≥ 1 encounter during the time period), or unknown. Fewer than 1% reported passive/environmental use, likely reflecting substantial under-reporting; therefore, we are unable to examine passive/environmental use as a separate category.
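The period-specific tobacco classification amounts to a precedence rule (current over former over never) applied to the statuses recorded in each period. The sketch below is illustrative: status labels and function names are hypothetical, and placing the period boundary at day 91 (the start of week 13) is an assumed reading of "through 12 weeks of gestation".

```python
from typing import Iterable, Optional


def tobacco_period(days_from_start: int, pregnancy_length_days: int) -> Optional[str]:
    """Assign an encounter to a period by days relative to pregnancy start."""
    if -365 <= days_from_start < 91:  # assumed boundary: before week 13
        return "pre_early"
    if 91 <= days_from_start <= pregnancy_length_days:
        return "mid_late"
    return None  # outside both periods


def classify_tobacco(statuses: Iterable[str]) -> str:
    """Classify tobacco use from all encounter-level statuses in one period."""
    seen = set(statuses)
    if "current" in seen:
        return "current"
    if "former" in seen:
        return "former"
    # Passive/environmental use is grouped with never use, as in the text.
    if "never" in seen or "passive" in seen:
        return "never"
    return "unknown"
```

Under this rule, a single report of current use in a period classifies the whole period as current, regardless of how many other encounters in that period recorded former or never use.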

Maternal conditions include pregestational and gestational diabetes mellitus (DM), pre-existing hypertension (HT), and other hypertensive disorders of pregnancy (HDP, including gestational hypertension, preeclampsia, and eclampsia). Pregestational DM was defined as 2 or more encounters with ICD 9 or 10 code indicating DM, or any DM codes on the problem list with onset before pregnancy. GDM was defined as 2 or more GDM ICD 9 or 10 codes in the encounter table during pregnancy or any GDM code on the problem list. HT and other HDP were identified using ICD 9 and 10 codes in encounters or the problem list and evidence of elevated blood pressures in the clinical record. GDM and HDP (excluding HT) are considered mediators in our conceptual framework and will be used in secondary or sensitivity analyses.

Child sex (male, female, unknown) is available for PE records or with patient demographic information among children who are also OCHIN patients.

Community-level variables were obtained from linked GIS data. The residential period of interest is during pregnancy; we extracted GIS data corresponding to the residential location recorded closest to the start of pregnancy. Community-level variables include sociodemographic composition (racial composition, median income, US Census tract); modified Retail Food Environment Index (census tract, CDC); recreation facilities, fast food restaurants, food stores (census tract, US Census Business Patterns), and air quality (County, Environmental Protection Agency).

Statistical analysis

We conducted a series of descriptive and longitudinal analyses that inform key factors required to investigate GWG trajectories within pre-pregnancy BMI categories in our unique population.

Study population characteristics

We describe the baseline sociodemographic and clinical characteristics of our pregnancy cohort and those excluded from the cohort. Given our objective of estimating effects of GWG trajectories on pregnancy and child outcomes among birthing persons across the spectrum of body size, we also present descriptive characteristics within each pre-pregnancy BMI classification. We focus on the magnitude of group differences rather than statistical testing, given our large sample size.

Frequency and timing of GWG data collection

We evaluated the number and timing of pregnancy weight measures, which influence the ability to characterize and analyze GWG trajectories. Specifically, we calculated the average number of weight measures available for each pregnancy, within the total pregnancy period and within trimesters. Data characteristics for children in the OCHIN health system were reported in a previous publication [39].


Derivation of the study cohort

The pregnancy algorithm identified 103,366 pregnancies that started between 4/16/2004 and 7/6/2020 among OCHIN health network patients 15 years of age or older at the start of pregnancy (Fig. 2). Among these, 1,942 pregnancies (1.9%) were excluded due to GA at delivery of less than 20 weeks, 0 days or more than 42 weeks, 6 days. Exclusions due to lack of required anthropometry data included 1,650 (1.6%) without a known height, and, among the remaining 99,774 pregnancies, 18,430 (18.5%) without a baseline weight measure, 1,688 (2.1%) with no measure in either the second or third trimester, and 2,057 (2.6%) with no additional measure during pregnancy. Thus, the PROMISE cohort includes 77,599 pregnancies lasting 20 to 42 weeks with adequate height and weight measures needed to examine GWG patterns.
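The exclusion counts reported above can be checked with a short sequential calculation. The counts are taken from the text; the function name and the choice to express each percentage relative to the pregnancies remaining before that step are assumptions of this sketch, which happen to reproduce the reported percentages.

```python
from typing import List, Tuple


def cohort_flow(
    identified: int, exclusions: List[Tuple[str, int]]
) -> List[Tuple[str, int, float, int]]:
    """Apply exclusions in order.

    Returns one (label, n_excluded, pct_of_remaining, n_remaining) per step,
    where the percentage uses the count remaining before that exclusion.
    """
    remaining = identified
    steps = []
    for label, n_excluded in exclusions:
        pct = round(100 * n_excluded / remaining, 1)
        remaining -= n_excluded
        steps.append((label, n_excluded, pct, remaining))
    return steps


# Counts as reported in the text.
PROMISE_EXCLUSIONS = [
    ("GA outside 20w0d to 42w6d", 1_942),
    ("no known height", 1_650),
    ("no baseline weight", 18_430),
    ("no 2nd or 3rd trimester weight", 1_688),
    ("no additional pregnancy weight", 2_057),
]

for label, n_excluded, pct, remaining in cohort_flow(103_366, PROMISE_EXCLUSIONS):
    print(f"{label}: excluded {n_excluded:,} ({pct}%), remaining {remaining:,}")
```

Starting from 103,366 identified pregnancies, the five exclusions leave 99,774 after the height step and 77,599 in the final cohort, matching the reported totals.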

Fig. 2 Flow chart

Sociodemographic and baseline characteristics

PROMISE cohort members had a mean age of 27.9 years at delivery, spanning from < 20 years (8.0%) to 40 years and older (3.2%) (Table 1). The largest race and ethnicity group was Hispanic (56.5%), followed by NH White (22.7%), Black (12.1%), and Asian (4.2%). The study population included small proportions (< 1%) but substantial absolute numbers (n > 300) of people with NH Native Hawaiian/Pacific Islander or American Indian/Alaska Native race and ethnicity. Over half (53.0%) of the sample had incomes below the poverty level and most had public insurance (82.2%), though with substantial groups who were uninsured (7.6%) or had private insurance (9.5%). The most common preferred spoken languages were English (55.0%) and Spanish (37.8%). Cohort members resided predominantly in California (38.5%) and Oregon (27.2%), with the remainder from Washington or other states (6.3 and 27.9%, respectively).

Table 1 Characteristics of pregnant individuals, per pregnancy, in the PROMISE Study Population

Compared to included pregnancies, excluded pregnancies occurred in patients who were slightly younger, had lower income, were more likely to be uninsured or have unknown insurance type, and were more likely to live in Washington (Table 1). However, the overall racial/ethnic composition was similar between included and excluded pregnancies. Tobacco use was more likely to be unknown in excluded pregnancies. Utilization patterns were consistent with exclusions due to lack of pre-pregnancy or pregnancy weights, as well as the focus on pregnancies for which OCHIN clinics provided prenatal care. That is, the first pregnancy encounter occurred after the first trimester in the vast majority of excluded pregnancies (85.8%); 13.4% of excluded pregnancies had no pregnancy encounters in the second or third trimesters.

PROMISE cohort members span the full spectrum of pre-pregnancy BMI: 2.1% underweight, 33.8% normal weight, 31.3% overweight, and 32.7% obesity (18.9% Class I, 8.4% Class II, 5.4% Class III) (Table 2). In general, those with incrementally higher BMI tended to be older, were more often Hispanic or Black, had lower income, and were more likely to have public insurance. In two exceptions, Black race and the lowest income level (≤50% FPL) were also more common in those with underweight. Current tobacco use prior to pregnancy was highest in those with underweight or Class III obesity, while tobacco use during pregnancy was highest in those with underweight. State of residence and utilization patterns were similar across BMI categories, although those with obesity were slightly more likely to have an early pregnancy (< 6 weeks) encounter.

Table 2 Pre- or early-pregnancy characteristics of included pregnant individuals, per pregnancy, in the PROMISE Study Population, stratified by pre-pregnancy BMI category

Longitudinal data characteristics

We examined aspects of longitudinal data availability that impact the ability to calculate total and trimester-specific GWG using observed weight measures, both overall and within each pre-pregnancy BMI category (Table 3). In the overall cohort, the last observed pregnancy weight measure was recorded a median of 5 days prior to delivery, and within 2 weeks of delivery for 78.0% of pregnancies. A median of 2, 4, and 6 valid weight measures were available in the first, second, and third trimesters, respectively. The number of available weight measures varied substantially: for example, in the third trimester, 10% had only 1 measure while 10% had ≥9 measures. Overall, 68.0, 88.2, and 88.0% of the cohort had a sufficient number of weight measures (≥2 measures within any given trimester) to calculate rate of weight gain within the first, second, and third trimesters, respectively. In our data, rates of weight gain calculated from closely spaced measures were highly variable; therefore, we also report the number and percent of pregnancies with at least 2 measures, at least one week apart, within any given trimester: 63.9, 87.2, and 88.0% of pregnancies in the first, second, and third trimesters, respectively. The availability of pregnancy weights was generally similar across pre-pregnancy BMI categories.
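To illustrate why a minimum spacing between measures matters, the sketch below computes a weekly rate of weight gain within one trimester only when at least two measures are at least 7 days apart. Defining the rate as the difference between the first and last measure divided by the elapsed weeks is an assumption made for this example, and the function name and data are hypothetical. For instance, a 1.2 kg difference recorded across 3 days would imply 2.8 kg per week, which is why such pairs are set aside here.

```python
from typing import List, Optional, Tuple

def weekly_rate(measures: List[Tuple[int, float]],
                min_gap_days: int = 7) -> Optional[float]:
    """Weekly weight gain (kg/week) within one trimester.

    measures: (gestational age in days, weight in kg) pairs from one trimester.
    Returns None when fewer than 2 measures exist or when the first and last
    measures are less than min_gap_days apart.
    Assumption: rate = (last weight - first weight) / weeks elapsed.
    """
    if len(measures) < 2:
        return None
    ordered = sorted(measures)  # sort by gestational age
    (d0, w0), (d1, w1) = ordered[0], ordered[-1]
    if d1 - d0 < min_gap_days:
        return None
    return (w1 - w0) / ((d1 - d0) / 7)

# Hypothetical second-trimester measures: 5.0 kg gained over 10 weeks.
print(weekly_rate([(100, 65.0), (128, 67.0), (170, 70.0)]))
# Measures only 3 days apart are not used to compute a rate.
print(weekly_rate([(100, 65.0), (103, 66.2)]))
```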

Table 3 Longitudinal follow-up of included pregnant individuals in the PROMISE Study Population

Gestational weight gain

Among term pregnancies with a weight within 2 weeks prior to the delivery date (n = 56,503), mean total GWG calculated from observed weights was 11.8 kg (Table 4). Within this subset, total GWG was below IOM/NAM recommendations in 25.7% of pregnancies and above recommendations in 42.6% of pregnancies. Among all pregnancies with ≥2 measures at least one week apart within a given trimester (n = 49,569, 67,708, and 67,822, respectively), weekly rate of weight gain was 0.00, 0.46, and 0.51 kg per week in the first, second, and third trimesters. Total GWG and second trimester GWG were incrementally lower with higher pre-pregnancy BMI; this pattern was also reflected in percentages gaining below or above IOM/NAM recommendations. In the first trimester, those with underweight exhibited the greatest weight gain, while average weight loss was observed in those with Class I, II, and III obesity. Third trimester GWG was more similar across BMI categories, though with slightly lower GWG with higher BMI.
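As a hedged illustration of the comparison with recommendations, the sketch below classifies total GWG for a term pregnancy against the 2009 IOM ranges for total weight gain (underweight 12.5 to 18 kg, normal weight 11.5 to 16 kg, overweight 7 to 11.5 kg, obesity 5 to 9 kg). The function names and the treatment of the range limits as inclusive are assumptions for the example; the study's own coding rules are not described in this section.

```python
# 2009 IOM recommended total GWG ranges (kg) by pre-pregnancy BMI category.
IOM_RANGES_KG = {
    "underweight": (12.5, 18.0),  # BMI < 18.5
    "normal": (11.5, 16.0),       # BMI 18.5 to < 25
    "overweight": (7.0, 11.5),    # BMI 25 to < 30
    "obesity": (5.0, 9.0),        # BMI >= 30 (all obesity classes)
}

def bmi_category(bmi: float) -> str:
    """Map pre-pregnancy BMI (kg/m^2) to the category used by the IOM ranges."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "normal"
    if bmi < 30:
        return "overweight"
    return "obesity"

def classify_total_gwg(pre_pregnancy_bmi: float, total_gwg_kg: float) -> str:
    """Return 'below', 'within', or 'above' the recommended range.

    Assumption: range limits are treated as inclusive.
    """
    low, high = IOM_RANGES_KG[bmi_category(pre_pregnancy_bmi)]
    if total_gwg_kg < low:
        return "below"
    if total_gwg_kg > high:
        return "above"
    return "within"

# The same 11.8 kg gain is classified differently across BMI categories.
for bmi in (17.5, 22.0, 27.0, 32.0):
    print(bmi, bmi_category(bmi), classify_total_gwg(bmi, 11.8))
```

Because the recommended ranges shift downward with higher BMI, an identical absolute gain can fall below the range for one category and above it for another.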

Table 4 Total and trimester-specific GWG calculated from observed weights among included pregnancies [mean (SE) unless otherwise noted]


Discussion

The PROMISE cohort is a unique pregnancy cohort of over 77,000 systemically underserved patients. We leveraged outpatient data from a national network of CHCOs to identify and date pregnancies, then extracted and used longitudinal anthropometric and other clinical measures to create study variables that align with our conceptual framework and clinical data collection processes. Our study population provides substantial numbers of understudied subgroups, including racial and ethnic groups traditionally underrepresented in research, very low-income groups, and uninsured patients. These data also provide extensive longitudinal measures needed to characterize and study GWG across the BMI spectrum, including underweight and obesity classes II and III.

A key contribution of this study is the derivation of the PROMISE cohort, including identification of pregnancies and follow-up in linked children, from CHCO outpatient records. The OCHIN network of CHCOs serves an exceptionally large and diverse patient population, using a data structure that provides more longitudinal detail than most existing pregnancy-research data resources (i.e., not relying only on inpatient data and/or health care claims). However, the lack of inpatient data poses unique methodological challenges for pregnancy research because childbirth-related claims data are unavailable. Our pregnancy algorithm can inform construction of similar cohorts in other patient populations without integrated hospital data.

Indeed, the PROMISE cohort provides greater representation of lower-resourced, more racially and ethnically diverse patients than most existing cohorts. Among PROMISE cohort members, 69% were Hispanic or non-Hispanic Black, 90% were publicly insured or uninsured, and over half had incomes below the federal poverty level. In comparison, pregnancy cohorts derived from EHR data from integrated care organizations include, for example, 9 to 31% Hispanic or non-Hispanic Black, with ≤5% Medicaid patients [11, 14]. Other EHR-derived cohorts such as samples from the Magee-Womens Hospital in Pittsburgh [34] offer different dimensions of diversity (28% Black, 40% public insurance). Prospective cohorts are often higher SES (e.g., 63.3% [59] or 44% [60] college graduate or higher education, 64.8% with incomes 350% FPL or higher [61]), though with notable examples of high representation in single-site studies [62, 63] or national studies that either do not follow children after birth [64] or were recruited prior to the rise in obesity prevalence [65, 66].

A second key contribution is our set of explicit, theory- and data-driven variable definitions with attention to timing relative to the pregnancy-related exposure and clinical data collection processes. EHR data are derived from clinical visits that occur with variable frequency, determined by a complex set of factors including health status, health care access, and personal and structural barriers [7]. EHR data are also influenced by clinical workflow and structure of the EHR platform. As a result, data availability within specific time periods can be sparse or, in cases like “last known FPL”, misleading. This is particularly pertinent for pregnancy research: pregnancy is a period of dynamic clinical, behavioral, and social changes, and factors during specific perinatal time frames have distinct impacts on health of the pregnant person and the offspring. We anchored our study variable definitions in an explicit conceptual framework and adapted the definitions to the realities of data availability and data collection processes in our CHCO context. By including detailed definitions and rationale, we hope these processes can be applied and tested further in other EHR-based study populations.

A third contribution is the quantification of the data needed to measure GWG across the BMI spectrum. The PROMISE cohort has a median of 2, 4, and 6 weight measures in the first, second, and third trimesters, respectively, with similar availability across BMI categories. These data are sufficient for calculating trimester GWG using traditional methods, but also provide a foundation for minimizing study exclusions with modeled data in future studies. These observed and modeled longitudinal data will enable us to fill longstanding and broadly recognized knowledge gaps about GWG in those with Class II and III obesity, as well as underweight. These gaps are largely due to insufficient sample sizes in most studies, requiring the combination of obesity class II and III together, or exclusion of underweight. In the PROMISE cohort, we have the ability to examine GWG and other pregnancy characteristics for the most vulnerable groups: for example, those with high BMI and with fewer resources to support behavioral or clinical needs; and those with underweight, who may have additional risk factors such as tobacco use or food insecurity, and without sufficient resources to overcome them.


We recognize limitations of the PROMISE cohort, stemming primarily from the reliance of EHR data on information recorded at clinical encounters at OCHIN clinics. This issue has several implications for potential biases. First, clinical data availability impacted the selection into the cohort: lack of availability of a baseline weight measure (18.5%) was the largest reason for exclusion, driven by typically sparse clinical visits prior to pregnancy. We minimized this exclusion by expanding the time window within which we accepted baseline weights and by incorporating pregravid weight, which is patient-reported at an initial prenatal care visit. Furthermore, while pregnancies that were included in the PROMISE cohort showed some differences in baseline characteristics compared to those that were excluded, these differences were slight, with minimal differences in age, race and ethnicity, and spoken language. Second, missing data within the defined cohort were notable for some variables, particularly for pre- or early-pregnancy characteristics. We minimized missing data through our conceptually- and data-driven process and will incorporate imputation methods in future analyses. Third, we lacked data from care received outside of the OCHIN network, including specialized or inpatient care. EHR-based algorithms for pregnancy identification typically include inpatient diagnoses [67, 68]; while we conducted extensive exploratory analysis to appropriately identify diabetes mellitus and hypertension prior to and during pregnancy using outpatient data, validity of these measures would be improved with inpatient data. Additionally, patients with comorbidities may be referred to specialized care outside of the OCHIN network. Those referred prior to or early in pregnancy could be excluded from our study population, while those referred later in pregnancy may be lost to follow-up, or potentially misclassified (resulting in, for example, underascertainment of pregnancy hypertension). Lastly, we were unable to ascertain diagnoses or procedures that occurred during hospital admissions, including the childbirth hospitalization (e.g., induction of labor, severe maternal morbidity).

Limitations also include the absence of information on key confounders that are not available in the clinical record, such as diet, infant feeding, or other behavioral or contextual influences; however, we draw from external data sources including birth records and GIS data where possible. We acknowledge that pregnancy and child weights and BMI are indirect estimates of adiposity, with systematic measurement error related to race and ethnicity. Finally, OCHIN clinics share a clinical data system (OCHIN Epic®) but are otherwise independent entities. Most OCHIN clinics are Federally Qualified Health Centers, for which funding is tied to data collection, metrics, and certain types of performance, but there remains variation in clinical policies, processes, characteristics, and norms across clinics.

Next steps and future directions

The PROMISE cohort will support research needed to inform future revisions of GWG guidelines, which will synthesize evidence informing optimal ranges of GWG that balance a wide range of maternal and child risks. Next steps include the characterization of GWG trajectories and estimation of their effects on child growth at birth, in infancy, and at the time of school entry. These planned analyses will provide evidence with representation of socially marginalized populations and adequate sample size across all BMI categories.

This research project provides opportunities for future integration of additional data to improve measurement and confounding adjustment, and it supports investigation of additional determinants or outcomes of GWG. For example, follow-up maternal or child data could be incorporated for additional calendar years as data become available. Linkage of inpatient clinical or claims data would enable more robust characterization of pregnancy outcomes and in-hospital procedures or diagnoses. Incorporation of laboratory and other clinical data and natural language processing methods would support creation and validation of additional complex variables, such as asthma, heart disease, or autoimmune disease. Death certificate data can be linked to ascertain the rare outcomes of fetal or neonatal death, in order to quantify potential selection bias or examine as outcome variables. Within a subsample, primary collection of data on infant feeding, household environment, and other behavioral and environmental variables would enable investigation of behavioral pathways and outcomes that occur outside of the clinical setting.


Conclusions

The PROMISE cohort enables the estimation of effects of GWG rate and timing on child outcomes in subgroups that lack the robust evidence base needed to form guidelines for pregnancy weight gain: those belonging to socially marginalized racial and ethnic populations; those who are uninsured, publicly insured, or discontinuously insured; and those with low or high BMI. The cohort provides extensive longitudinal weight measures throughout pregnancy and, as shown in a previous publication [39], in the offspring. With these unique data, we will characterize GWG patterns and their estimated effects on child growth, ultimately improving the representation of GWG evidence and corresponding guidelines.

Availability of data and materials

Raw data underlying this article were generated from multiple health systems across the OCHIN network; restrictions apply to the availability and re-release of data under organizational agreements. Researchers interested in accessing the study data can find relevant information at.



Abbreviations

ADVANCE: Accelerating Data Value Across a National Community Health Center Network

BMI: Body Mass Index

CHCO: Community-based Healthcare Organization

CPT: Current Procedural Terminology

EDD: Estimated Delivery Date

EHR: Electronic Health Record

Encounter-based Pregnancy Records

FPL: Federal Poverty Level

GA: Gestational Age

GIS: Geographic Information System

GWG: Gestational Weight Gain

ICD: International Classification of Diseases

IOM: Institute of Medicine

KP: Kaiser Permanente

LMP: Last Menstrual Period

NAM: National Academy of Medicine

OCHIN: (Not an acronym)

Pregnancy Episodes of Care

PROMISE: PReventing Obesity through healthy Maternal gestational weight gain In the Safety nEt


  1. Goldstein RF, Abell SK, Ranasinha S, et al. Association of gestational weight gain with maternal and infant outcomes: a systematic review and meta-analysis. JAMA. 2017;317(21):2207–25.

  2. Catalano PM, Shankar K. Obesity and pregnancy: mechanisms of short term and long term adverse consequences for mother and child. BMJ. 2017;356:j1.

  3. Lau EY, Liu J, Archer E, McDonald SM, Liu J. Maternal weight gain in pregnancy and risk of obesity among offspring: a systematic review. J Obes. 2014;2014:524939.

  4. American College of Obstetricians and Gynecologists. ACOG Committee opinion no. 548: weight gain during pregnancy. Obstet Gynecol. 2013;121(1):210–2.

  5. Institute of Medicine and National Research Council (IOM and NRC). Weight Gain During Pregnancy: Reexamining the Guidelines. Washington, DC: The National Academies Press; 2009.

  6. Siega-Riz AM, Bodnar LM, Stotland NE, Stang J. The current understanding of gestational weight gain among women with obesity and the need for future research. NAM Perspectives. Washington, DC: Discussion Paper, National Academy of Medicine; 2019.

  7. Casey JA, Schwartz BS, Stewart WF, Adler NE. Using electronic health records for population health research: a review of methods and applications. Annu Rev Public Health. 2016;37:61–81.

  8. Aris IM, Lin PD, Rifas-Shiman SL, et al. Association of early antibiotic exposure with childhood body mass index trajectory milestones. JAMA Netw Open. 2021;4(7):e2116581.

  9. Sridhar SB, Darbinian J, Ehrlich SF, et al. Maternal gestational weight gain and offspring risk for childhood overweight or obesity. Am J Obstet Gynecol. 2014;211(3):259 e1-8.

  10. Hillier TA, Ogasawara KK, Pedula KL, Vesco KK, Oshiro CES, Van Marter JL. Timing of Gestational Diabetes Diagnosis by Maternal Obesity Status: Impact on Gestational Weight Gain in a Diverse Population. J Womens Health (Larchmt). 2020;29(8):1068–76.

  11. Sharma S, Vesco KK, Bulkley J, et al. Associations of gestational weight gain with preterm birth among underweight and normal weight women. Matern Child Health J. 2015;19(9):2066–73.

  12. DeVoe JE, Gold R, Cottrell E, et al. The ADVANCE network: accelerating data value across a national community health center network. J Am Med Inform Assoc. 2014;21(4):591–5.

  13. Puhl RM. What words should we use to talk about weight? A systematic review of quantitative and qualitative studies examining preferences for weight-related terminology. Obes Rev. 2020;21(6):e13008.

  14. Badon SE, Quesenberry CP, Xu F, Avalos LA, Hedderson MM. Gestational weight gain, birthweight and early-childhood obesity: between- and within-family comparisons. Int J Epidemiol. 2020;49(5):1682–90.

  15. MacDonald SC, Bodnar LM, Himes KP, Hutcheon JA. Patterns of gestational weight gain in early pregnancy and risk of gestational diabetes mellitus. Epidemiology. 2017;28(3):419–27.

  16. Hornbrook MC, Whitlock EP, Berg CJ, et al. Development of an algorithm to identify pregnancy episodes in an integrated health care delivery system. Health Serv Res. 2007;42(2):908–27.

  17. Research Data & Analytics. Accessed 1 Mar 2024.

  18. Angier H, Giebultowicz S, Kaufmann J, et al. Creation of a linked cohort of children and their parents in a large, national electronic health record dataset. Medicine (Baltimore). 2021;100(32):e26950.

  19. Practice bulletin no. 146: Management of late-term and postterm pregnancies. Obstet Gynecol. 2014;124(2 Pt 1):390–6.

  20. Kenyon S, Middleton L, Skrybant M, Johnston T. When to induce late term pregnancies. BMJ. 2019;367:l6486.

  21. Wennerholm UB, Saltvedt S, Wessberg A, et al. Induction of labour at 41 weeks versus expectant management and induction of labour at 42 weeks (SWEdish Post-term Induction Study, SWEPIS): multicentre, open label, randomised, superiority trial. BMJ. 2019;367:l6131.

  22. Block JP, Bailey LC, Gillman MW, et al. PCORnet antibiotics and childhood growth study: process for cohort creation and cohort description. Acad Pediatr. 2018;18(5):569–76.

  23. Angier H, Gold R, Crawford C, et al. Linkage methods for connecting children with parents in electronic health record and state public health insurance data. Matern Child Health J. 2014;18(9):2025–33.

  24. National Center for Health Statistics. Accessed 11 Mar 2024.

  25. Bazemore AW, Cottrell EK, Gold R, et al. “Community vital signs”: incorporating geocoded social determinants into electronic records to promote patient and population health. J Am Med Inform Assoc. 2016;23(2):407–12.

  26. Das SR, Kinsinger LS, Yancy WS Jr, et al. Obesity prevalence among veterans at Veterans Affairs medical facilities. Am J Prev Med. 2005;28(3):291–4.

  27. Cheng FW, Gao X, Mitchell DC, et al. Body mass index and all-cause mortality among older adults. Obesity (Silver Spring). 2016;24(10):2232–9.

  28. CDC. Defining Adult Overweight and Obesity. 6/3/22.

  29. Abrams B, Selvin S. Maternal weight gain pattern and birth weight. Obstet Gynecol. 1995;86:163–9.

  30. Margerison-Zilko CE, Shrimali BP, Eskenazi B, Lahiff M, Lindquist AR, Abrams BF. Trimester of maternal gestational weight gain and offspring body weight at birth and age five. Matern Child Health J. 2012;16(6):1215–23.

  31. Alexander GR, Himes JH, Kaufman RB, Mor J, Kogan M. A United States national reference for fetal growth. Obstet Gynecol. 1996;87(2):163–8.

  32. Talge NM, Mudd LM, Sikorskii A, Basso O. United States birth weight reference corrected for implausible gestational age estimates. Pediatrics. 2014;133(5):844–53.

  33. Aris IM, Kleinman KP, Belfort MB, Kaimal A, Oken E. A 2017 US reference for singleton birth weight percentiles using obstetric estimates of gestation. Pediatrics. 2019;144(1)

  34. Bodnar LM, Hutcheon JA, Platt RW, Himes KP, Simhan HN, Abrams B. Should gestational weight gain recommendations be tailored by maternal characteristics? Am J Epidemiol. 2011;kwr064 [pii].

  35. Chauhan SP, Magann EF, Zhao Y, Klimpel JM, Brown JA, Morrison JC. Maternal body mass index: a poor diagnostic test for detection of abnormal fetal growths. Am J Perinatol. 2011.

  36. Ray JG, Park AL, Fell DB. Mortality in Infants affected by preterm birth and severe small-for-gestational age birth weight. Pediatrics. 2017;140(6).

  37. Weissmann-Brenner A, Simchen MJ, Zilberberg E, et al. Maternal and neonatal outcomes of large for gestational age pregnancies. Acta Obstet Gynecol Scand. 2012;91(7):844–9.

  38. Andrea SB, Messer LC, Marino M, Goodman JM, Boone-Heinonen J. A Nationwide Investigation of the Impact of the Tipped Worker Subminimum Wage on Infant Size for Gestational Age. Prev Med. 2020;133:106016.

  39. Boone-Heinonen J, Tillotson CJ, O’Malley JP, et al. Characterizing a “Big data” cohort of over 200,000 low-income U.S. infants and children for obesity research: The ADVANCE Early Life Cohort. Matern Child Health J. 2017;21(3):421–31.

  40. Silverwood RJ, De Stavola BL, Cole TJ, Leon DA. BMI peak in infancy as a predictor for later BMI in the Uppsala Family Study. Int J Obes (Lond). 2009;33(8):929–37.

  41. Jensen SM, Ritz C, Ejlerskov KT, Molgaard C, Michaelsen KF. Infant BMI peak, breastfeeding, and body composition at age 3 y. Am J Clin Nutr. 2015;101(2):319–25.

  42. Lampl M, Mummert A. Historical approaches to human growth studies limit the present understanding of growth biology. Ann Nutr Metab. 2014;65(2–3):114–20.

  43. Perng W, Rifas-Shiman SL, Kramer MS, et al. Early weight gain, linear growth, and mid-childhood blood pressure: a prospective study in project viva. Hypertension. 2016;67(2):301–8.

  44. WHO (World Health Organization). The WHO Child Growth Standards. Available at: Accessed 15 May 2013.

  45. Cole TJ. Weight/height^p compared to weight/height^2 for assessing adiposity in childhood: influence of age and bone age on p during puberty. Ann Hum Biol. 1986;13(5):433–51.

  46. Roy SM, Spivack JG, Faith MS, et al. Infant BMI or weight-for-length and obesity risk in early childhood. Pediatrics. 2016;137(5).

  47. Rolland-Cachera MF. Childhood obesity: current definitions and recommendations for their use. Int J Pediatr Obes. 2011;6(5–6):325–31.

  48. Kuczmarski RJ, Ogden CL, Guo SS, et al. 2000 CDC Growth Charts for the United States: methods and development. Vital Health Stat 11. 2002;246:1–190.

  49. Freedman DS, Davies AJG, Kompaniyets L, et al. A longitudinal comparison of alternatives to body mass index Z-scores for children with very high body mass indexes. J Pediatr. 2021;235:156–62.

  50. Flegal KM, Wei R, Ogden CL, Freedman DS, Johnson CL, Curtin LR. Characterizing extreme values of body mass index-for-age by using the 2000 Centers for Disease Control and Prevention growth charts. Am J Clin Nutr. 2009;90(5):1314–20.

  51. Bilheimer LT, Sisk JE. Collecting adequate data on racial and ethnic disparities in health: the challenges continue. Health Aff (Millwood). 2008;27(2):383–91.

  52. Pine M, Kowlessar NM, Salemi JL, et al. Enhancing clinical content and race/ethnicity data in statewide hospital administrative databases: obstacles encountered, strategies adopted, and lessons learned. Health Serv Res. 2015;50(Suppl 1):1300–21.

  53. Flanagin A, Frey T, Christiansen SL, AMA Manual of Style Committee. Updated guidance on the reporting of race and ethnicity in medical and science journals. JAMA. 2021;326(7):621–7.

  54. Flores G. Language barriers and hospitalized children: are we overlooking the most important risk factor for adverse events? JAMA Pediatr. 2020;174(12):e203238.

  55. Boyd RW, Lindo EG, Weeks LD, McLemore MR. On racism: a new standard for publishing on racial health inequities. Health Aff Blog. 2020.

  56. Stanczyk AB. The dynamics of U.S. household economic circumstances around a birth. Demography. 2020;57(4):1271–1296.

  57. Booman A, Stratton K, Vesco KK, et al. Insurance coverage and discontinuity during pregnancy: Frequency and associations documented in the PROMISE cohort. Health Serv Res. 2023.

  58. Stage 2 Eligible Professional Meaningful Use Core Measures Measure 5 of 17. May 11, 2023. Accessed 11 May 2023.

  59. Oken E, Kleinman KP, Belfort MB, Hammitt JK, Gillman MW. Associations of gestational weight gain with short- and longer-term maternal and child health outcomes. Am J Epidemiol. 2009;170(2):173–80.

  60. Starling AP, Brinton JT, Glueck DH, et al. Associations of maternal BMI and gestational weight gain with neonatal adiposity in the Healthy Start study. Am J Clin Nutr. 2015;101(2):302–9.

  61. Deierlein AL, Siega-Riz AM, Herring AH, Adair LS, Daniels JL. Gestational weight gain and predicted changes in offspring anthropometrics between early infancy and 3 years. Pediatr Obes. 2012;7(2):134–42.

  62. Messito MJ, Katzow MW, Mendelsohn AL, Gross RS. Starting Early program impacts on feeding at infant 10 months age: a randomized controlled trial. Child Obes. 2020;16(S1):S4–13.

  63. Deierlein AL, Messito MJ, Katzow M, Berube LT, Dolin CD, Gross RS. Total and trimester-specific gestational weight gain and infant anthropometric outcomes at birth and 6 months in low-income Hispanic families. Pediatr Obes. 2020;15(3):e12589.

  64. Pugh SJ, Ortega-Villa AM, Grobman W, et al. Longitudinal changes in maternal anthropometry in relation to neonatal anthropometry. Public Health Nutr. 2019;22(5):797–804.

  65. Klebanoff MA. The Collaborative Perinatal Project: a 50-year retrospective. Paediatr Perinat Epidemiol. 2009;23(1):2–8.

  66. Oza-Frank R, Keim SA. Should obese women gain less weight in pregnancy than recommended? Birth. 2013;40(2):107–14.

  67. Suarez EA, Haug N, Hansbury A, Stojanovic D, Corey C. Prescription medication use and baseline health status of women with live-birth deliveries in a national data network. Am J Obstet Gynecol MFM. 2022;4(1):100512.

  68. Milic NM, Codsi E, Butler Tobah YS, et al. Electronic Algorithm Is Superior to Hospital Discharge Codes for Diagnoses of Hypertensive Disorders of Pregnancy in Historical Cohorts. Mayo Clin Proc. 2018;93(12):1707–19.



This work was conducted with the Accelerating Data Value Across a National Community Health Center Network (ADVANCE) Clinical Research Network (CRN). ADVANCE is led by OCHIN in partnership with Health Choice Network, Fenway Health, and Oregon Health & Science University.


ADVANCE is funded through the Patient-Centered Outcomes Research Institute (PCORI), contract number RI-OCHIN-01-MC. This work was supported by the National Institute of Diabetes and Digestive and Kidney Diseases (R01DK118484). The funding bodies had no involvement in the design and collection, analysis, or interpretation of the data or in writing the manuscript.

Author information


J.B.H. drafted manuscript and conceptualized and oversaw study design, data analysis, and interpretation. K.L.S. led, conducted, and documented identification of pregnancies and data management and extraction of data from the OCHIN electronic health record. R.S. conducted the data analysis. T.S. supervised the OCHIN data team and contributed to the identification of pregnancies and data management and extraction of data from the OCHIN electronic health record. K.K.V. was a major contributor to the study design and interpretation of results. A.B., K.S. and S.T. were major contributors to the development of study variables. D.D., S.L. and J.O. were major contributors to the data analysis. B.A.F., S.P.F. and J.M.S. contributed to the study design and interpretation of results. J.H. contributed to the OCHIN data management and extraction. A.P. managed research execution. All authors reviewed and edited manuscript drafts and read and approved the final manuscript.

Corresponding author

Correspondence to Janne Boone-Heinonen.

Ethics declarations

Ethics approval and consent to participate

The PROMISE study was approved by the Institutional Review Board at Oregon Health & Science University (Study 00020810), the lead site for the study. This study involved analysis of existing data and was granted a waiver of informed consent and authorization by the Institutional Review Board at Oregon Health & Science University. All study procedures were carried out in accordance with federal and institutional privacy and security guidelines and regulations.

Consent for publication

Not applicable.

Competing interests

Dr. Vesco reports that her institution received funding from Pfizer Independent Grants for Learning and Change for an unrelated study. All other authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Boone-Heinonen, J., Lyon-Scott, K., Springer, R. et al. Pregnancy health in a multi-state U.S. population of systemically underserved patients and their children: PROMISE cohort design and baseline characteristics. BMC Public Health 24, 886 (2024).
