
Global measurement of intimate partner violence to monitor Sustainable Development Goal 5



One third of women experience intimate partner violence (IPV) and potential sequelae. Sustainable Development Goal (SDG) 5.2—to eliminate violence against women, including IPV—compels states to monitor such violence. We conducted the first global measurement-invariance assessment of standardised item sets for IPV.


Demographic and Health Surveys (DHS) from 36 Lower-/Middle-Income Countries (LMICs) administering 18 IPV items during 2012–2018 were included. Analyses were performed separately for two item sets: lifetime physical IPV (seven items) and controlling behaviours (five items). We performed country-specific exploratory and confirmatory factor analyses (EFA/CFA). Datasets meeting benchmarks for acceptable item loadings and model-fit statistics were included in multiple-group CFA (MGCFA) to test for exact measurement invariance. Based on findings, alignment optimization (AO) was performed to assess approximate measurement invariance (< 25% of model parameters non-invariant). For each item set, national rankings based on AO-derived scores and on prevalence estimates were compared. AO-derived scores were correlated with type-specific IPV prevalences to assess correspondence.


National rates of physical IPV (5.6–50.5%) and controlling behavior (25.9–84.7%) varied. For each item set, item loadings and model-fit statistics were adequate in country-specific, unidimensional EFAs and CFAs. Both unidimensional constructs lacked exact invariance in MGCFA but achieved approximate invariance in AO analysis (12.3% of model parameters for physical IPV and 6.7% for controlling behaviour non-invariant). For both item sets, national rankings based on AO-derived scores were distributed similarly to rankings based on prevalence. However, estimates often were not significantly different cross-nationally, precluding national-level comparisons regardless of estimation strategy. Three physical-IPV items (slap, twist, choke) and two controlling-behaviour items (meet female friends; contact with family) warrant cognitive testing to improve their psychometric properties. Correlations of AO-derived scores for physical IPV (0.48–0.66) and controlling behaviours (0.49–0.87) with prevalences of lifetime physical, sexual, psychological IPV as well as controlling behaviour varied.


Seven DHS lifetime physical-IPV items and five DHS controlling-behaviour items were approximately invariant across 36 LMICs spanning five world regions, such that cross-national comparisons of factor means are reasonable. Measurement-invariance testing over time will inform their utility to monitor SDG5.2.1; cross-national, cross-time measurement-invariance testing of improved sexual and psychological IPV item-sets is needed.



Intimate partner violence (IPV)—or psychological, physical, and sexual violence and controlling behaviour perpetrated by a spouse or dating partner—is a global public-health problem. Approximately 27% (95% Confidence Interval [CI] 23–31%) of ever-partnered women 15–49 years have ever experienced physical and/or sexual IPV, with regional estimates ranging from 18 to 35% [1]. Adverse effects of IPV on women may include economic insecurity and physical-, mental-, behavioural-, sexual-, or reproductive-health conditions [2,3,4,5,6,7,8,9]. IPV compromises national economic development, costing an estimated 5% of world gross domestic product (GDP) and nearly 15% of GDP in Sub-Saharan Africa [10].

Given the health, social, and economic costs of IPV, United Nations’ bodies, treaties, and declarations have called for better statistics on the nature, prevalence, causes, and consequences of violence against women as a basis for its elimination [11]. This pressure led, in 2015, to Sustainable Development Goal (SDG) 5.2, which urges governments to “eliminate all forms of violence against all women and girls in public and private...” [12]. Widespread endorsement of SDG5.2 compels national governments to measure and to report rates of violence against women, including IPV (SDG5.2.1).

The decades leading up to SDG5.2 saw marked growth in the number of IPV prevalence surveys. These surveys relied on diverse scales and data-collection approaches [13], from small-scale, localised research, to large multi-country studies [14, 15], and ongoing surveillance of IPV in multipurpose national surveys. No gold standard exists for data collection on IPV, but the Centers for Disease Control and Prevention [16], World Health Organization (WHO) [17], and Demographic and Health Surveys (DHS) [18] have agreed best practices. These practices include direct inquiry about acts experienced within a clear timeframe; the use of multiple, behaviourally-specific questions to capture reported experiences of specific acts of IPV; reliance on appropriately trained interviewers; and support for respondents and interviewers [11].

The DHS domestic violence module (DVM) is the most commonly administered module that follows these best practices to measure IPV at the national level in lower- and middle-income countries (LMICs). The DHS is a flagship project of the United States Agency for International Development (USAID), which has invested several hundred million dollars in data collection since 1984 [19] and is a critical source of population and health data for LMICs [20]. The DHS DVM is optional; however, by the end of 2020, 65 countries had administered it at least once, and 39 countries had administered it more than once (range: 1–9 times) [18], documenting large differences in national IPV prevalence [1].

While the DHS is used to inform policies, prevention efforts, and response interventions, the DHS DVM has not undergone a rigorous psychometric assessment. It, therefore, is unknown whether questions in the module are measurement invariant across countries on a global scale, a critical precondition for national comparisons. Research by members of this team on DHS questions about the acceptability of IPV showed modest non-comparability of prevalence estimates across countries due to module-design factors, such as slight differences across surveys in the number, wordings, and introductory framing of the questions [21]. If not identified and accounted for, areas of non-comparability may distort estimated differences in national IPV prevalence [21], with potential implications for national policies and the allocation of resources for prevention and response [22]. Addressing this knowledge gap now is critical, since the number of countries monitoring IPV will only increase with SDG5.2.

The objective of this paper is to perform the first comprehensive, global psychometric assessment of items developed to measure IPV in the DHS DVM. Using 36 national surveys conducted in LMICs during 2012–2018, we focused our main analysis on the item sets designed to measure lifetime physical IPV (seven items) and controlling behaviours (five items). The larger numbers of items in both sets made them more likely to be content valid, and violence researchers consider the physical IPV items to be more behaviourally specific and reliable than the psychological or sexual IPV items [23]. Our use of data from the DHS, the most geographically diverse source for nationally-representative data on IPV using similarly worded questions, enables us to make evidence-based recommendations that are global in scope across LMICs. Our findings inform next steps in a global research agenda to improve measures of IPV to monitor SDG5.2.1.


Eligibility and sample

The DHS are multipurpose surveys administered to large, nationally-representative samples of households and randomly selected women of reproductive age (typically 15–49 years) in interviewed households. The DHS routinely collect data on women’s and children’s health. They use internationally recognised guidelines for survey methodology and for the ethical collection of data, including data on violence against women and girls (VAWG) [15, 24].

Eligible countries had completed a DHS between 2012 and 2018 (inclusive) and had administered the same 18 items measuring physical, sexual, or psychological IPV and controlling behaviours. Based on these criteria, the sample for this analysis included 36 DHS conducted in 36 LMICs (according to the World Bank classification system) and spanning five world regions.

Included DHS represented countries in Sub-Saharan Africa (22 countries), followed by countries in South and Southeast Asia (nine countries), Central Asia (two countries), North Africa/West Asia (two countries), and finally Latin America and the Caribbean (one country) (Table 1). Although a select sample, included DHS were conducted in demographically diverse national populations. For example, countries in the sample ranged widely in population size, from 516,000 people in the Maldives in its survey year to 1.35 billion in India in its survey year. Countries in the sample also ranged widely on the GINI index of income inequality retrieved for 2009–2018, from lower income inequality in the Kyrgyz Republic (GINI = 27.4) to higher inequality in Namibia (GINI = 59.1). Countries also ranged in gross national income per capita for 2018, from USD280 in Burundi to USD9310 in the Maldives, and in median grades of schooling completed for women of reproductive age in each DHS, from 3.0 in Nepal to 10.7 in the Philippines. Gender differences in the law, measured using the World Bank index on Women, Business, and the Law, ranged from 28.8 for Afghanistan to 86.9 for Zimbabwe, with higher scores indicating greater gender parity under the law (Supplemental Table S1). Finally, basic survey conditions varied somewhat across included DHS. The average survey team size ranged from 3 to 10 members. The number of training days ranged from 19 to 42, and the average interview duration ranged from 20 to 90 min, with a majority of DHS reporting an average duration of 30–60 min.

Table 1 Characteristics of included countries and Demographic and Health Surveys, N = 36 surveys across 36 countries 2012–2018

Data on IPV

The IPV-related questions in the DHS DVM [18] originated from the Revised Conflict Tactics Scales [15], a standardised instrument designed to capture behaviourally based acts of IPV ranging in severity from jealousy or anger for talking to other men, to pushing or shoving, to the threat or actual use of a weapon. The DHS DVM has evolved to resemble more closely the instrument used by the WHO [17]. Specifically, the module includes three items to assess acts of psychological IPV, seven items to assess acts of physical IPV, three items to assess acts of sexual IPV, and five items to assess acts of male controlling behaviour. The occurrence of IPV is measured as the woman’s self-report of experiencing each IPV item: 1) ever in the lifetime of her referent relationship, and if yes, 2) with a standardised frequency in the 12 months before interview. Women’s reported experience of five controlling behaviours is measured without a specific timeframe or frequency. All items assess IPV in relation to the woman’s most recent spouse or partner. Supplemental Table S2 provides standard item wordings in English for each IPV item. Initial data exploration suggested that fewer than 2% of women in any included DHS sample had missing data on any single IPV item, and only 0.02% of all women (n = 65) across all 36 DHS had missing data on all IPV items.

Statistical analysis

We used Stata [25] for data processing and descriptive analyses and Mplus [26] for all other analyses. The main statistical analyses involved four major steps. As a first step, we conducted descriptive analyses to understand country-specific missingness and prevalence for each IPV item and item-specific prevalence ranges across included countries. As a second step, we performed 36 country-specific factor analyses to explore and then to confirm dimensionality of each IPV item set, the magnitudes of item loadings, and overall model fit. For each country, the exploratory factor analysis (EFA) model was considered adequate if: item loadings were 0.35 or greater; model fit statistics met recommended benchmarks (the root mean square error of approximation (RMSEA) was about 0.08 or lower, and the comparative fit index (CFI) and Tucker-Lewis index (TLI) were about 0.95 or higher); and the results fit with theory [27]. We then conducted country-specific confirmatory factor analyses (CFA), including countries that met the above-mentioned model-fit criteria in the EFA. We used the same criteria for the item loadings and model-fit statistics to assess the adequacy of the fit of all CFA models. The EFA and CFA used the mean- and variance-adjusted weighted least squares estimator, which is appropriate for dichotomous responses (1 = [ever] experienced, 0 = did not [ever] experience the IPV item). The approach used pairwise deletion to handle missing data [28].
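The adequacy benchmarks described above can be expressed as a simple check. The following is a minimal sketch and not the authors' code (the factor analyses were run in Mplus): the function name `model_is_adequate` and the example values are hypothetical, and the "about" in the RMSEA, CFI, and TLI benchmarks is treated here as a strict cut-off for simplicity. The requirement that results fit with theory is a matter of judgment and is not encoded.

```python
def model_is_adequate(loadings, rmsea, cfi, tli,
                      min_loading=0.35, max_rmsea=0.08, min_fit=0.95):
    """Return True if a one-factor model meets the benchmarks stated in the text.

    loadings: standardized item loadings for one country.
    rmsea, cfi, tli: model-fit statistics for the same model.
    """
    loadings_ok = all(value >= min_loading for value in loadings)
    fit_ok = rmsea <= max_rmsea and cfi >= min_fit and tli >= min_fit
    return loadings_ok and fit_ok


# Hypothetical values for one country (illustrative only, not taken from the paper).
example_loadings = [0.81, 0.77, 0.85, 0.79, 0.88, 0.72, 0.69]
print(model_is_adequate(example_loadings, rmsea=0.03, cfi=0.99, tli=0.99))
```

In practice, a country failing this check at the EFA stage would not proceed to the CFA stage, and one failing at the CFA stage would not enter the cross-national invariance tests.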

As a third step, for national datasets that exhibited adequacy with respect to item loadings and model-fit statistics, we considered two approaches to assess the cross-national measurement invariance of the models confirmed in country-specific CFAs. Initially, we performed multiple-group CFA (MGCFA) to test for exact measurement invariance. When using this approach, small measurement differences are assumed to be exactly zero [29]. Following this approach, we tested sequentially for configural invariance, or equivalence of the factor structure across countries; then metric invariance, or equivalence of the factor loadings across countries; and then scalar invariance, or equivalence of the factor loadings and thresholds (or intercepts) across countries. Configural invariance implies that the dimensional structure of the latent IPV factor is equivalent across countries, although the item loadings and intercepts are free to vary across countries; whereas configural non-invariance implies that the latent IPV factor has a different dimensional structure across countries. Metric invariance implies that each IPV item contributes to the latent IPV construct to a similar degree across countries. Conversely, metric non-invariance implies that at least one IPV item is related differently to the latent IPV construct across countries. Scalar invariance implies that the factorial scores are comparable across countries. Conversely, scalar non-invariance may indicate potential measurement bias and suggests that larger forces, such as cultural norms, may influence systematically how different populations respond to IPV items in ways that are unrelated to the latent IPV construct. We used Maximum Likelihood estimation, which is appropriate for dichotomous responses and allowed us to test separately for metric and scalar invariance [30].
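The three nested levels of exact invariance can be summarised in notation. The sketch below assumes a logistic link for the dichotomous items, which the text does not state explicitly; the symbols are introduced here for illustration, with $y_{ijg}$ the response of woman $i$ to item $j$ in country $g$, $\eta_{ig}$ the latent IPV factor, $\lambda_{jg}$ the item loading, and $\tau_{jg}$ the item threshold.

```latex
\begin{align*}
\operatorname{logit} P(y_{ijg} = 1 \mid \eta_{ig}) &= \lambda_{jg}\,\eta_{ig} - \tau_{jg}
  && \text{(configural: same one-factor structure, } \lambda_{jg}, \tau_{jg} \text{ free)} \\
\lambda_{jg} &= \lambda_{j} \quad \text{for all } g
  && \text{(metric: equal loadings)} \\
\lambda_{jg} = \lambda_{j}, \quad \tau_{jg} &= \tau_{j} \quad \text{for all } g
  && \text{(scalar: equal loadings and thresholds)}
\end{align*}
```

Each level adds equality constraints to the previous one, which is why the models are compared sequentially.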

In the exact invariance-testing framework, evidence of metric or scalar non-invariance leaves three analytical options: (1) investigate the source of the non-invariance by sequentially releasing or adding loading or intercept constraints and retesting the models until partial measurement invariance is achieved, (2) omit IPV items with non-invariant loadings or intercepts and retest the sequential invariance models, or (3) assume that the IPV construct is noninvariant and discontinue exact invariance testing. Given the large number of countries and small number of IPV items per set, we did not consider options (1) or (2) to be advisable.

Instead, as a fourth step, based on findings from the MGCFA, we used alignment optimization (AO) to assess approximate measurement invariance of the IPV items across countries. According to users of AO methods, the restriction of equal model parameters required by MGCFA may be overly strict, especially when many groups or time points are involved in the comparison (e.g., Davidov et al. [31]). The approximate measurement invariance approach allows, instead, for differences in these model parameters across groups by finding an optimal model with the minimal amount of measurement non-invariance. In the first step of AO [32], MGCFA was used to confirm cross-national configural invariance of the IPV factor model. In the second step of AO, if configural invariance was achieved, the factor means and variances of all but the reference group, which were fixed to 0 and 1, were estimated to minimise the total amount of non-invariance across all parameters. The quality of the alignment result, then, was determined by the percentage of loading and intercept parameters that displayed non-invariance. As a guide, a limit of 25% of non-invariant parameters or less indicated trustworthy results [33]. For higher percentages, a Monte Carlo simulation is advised to assess the quality of the results [33]. Monte Carlo simulations are based on the correlation between the population factor means and the estimated alignment factor means, computed over groups and averaged over replications. Correlations of at least 0.98 produce reliable factor means [33]. Like MGCFA, AO employed maximum likelihood estimation, which used all available data, assuming data were missing at random [28, 33].
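The quantity minimised in the second step of AO can be sketched in code. This is an illustrative reading of the alignment method as it is commonly described (a component loss of the form f(x) = sqrt(sqrt(x² + ε)) applied to every between-group difference in loadings and intercepts, weighted by group sizes), not the authors' Mplus setup. The function names, the value ε = 0.01, and the toy numbers are assumptions, and "intercepts" stand in for the thresholds of dichotomous items.

```python
from itertools import combinations
from math import sqrt


def component_loss(x, eps=0.01):
    # f(x) = sqrt(sqrt(x^2 + eps)); eps is an assumed small positive constant.
    return sqrt(sqrt(x * x + eps))


def aligned_parameters(loadings0, intercepts0, alpha, psi):
    """Rescale one group's configural parameters (factor mean 0, variance 1)
    to an equivalent model with factor mean alpha and factor variance psi."""
    sd = sqrt(psi)
    loadings = [lam / sd for lam in loadings0]
    intercepts = [nu - alpha * lam / sd for nu, lam in zip(intercepts0, loadings0)]
    return loadings, intercepts


def total_noninvariance(groups):
    """Sum the weighted component losses over all pairs of groups.

    groups: list of dicts with keys 'n' (sample size), 'loadings', 'intercepts'.
    """
    total = 0.0
    for g1, g2 in combinations(groups, 2):
        weight = sqrt(g1["n"] * g2["n"])
        for key in ("loadings", "intercepts"):
            for p1, p2 in zip(g1[key], g2[key]):
                total += weight * component_loss(p1 - p2)
    return total


# Toy example: the second group differs from the reference only in its factor mean (0.5).
reference = {"n": 100, "loadings": [1.0, 0.8], "intercepts": [0.0, 0.2]}
raw = {"n": 100, "loadings": [1.0, 0.8], "intercepts": [0.5, 0.6]}
lam, nu = aligned_parameters(raw["loadings"], raw["intercepts"], alpha=0.5, psi=1.0)
aligned = {"n": 100, "loadings": lam, "intercepts": nu}
print(total_noninvariance([reference, raw]) > total_noninvariance([reference, aligned]))
```

An actual alignment run searches over the factor means and variances of all non-reference groups to minimise this total; the toy example only shows that choosing the right factor mean removes the apparent intercept differences.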

At the initial stages of analysis, we attempted to follow the above steps including the following IPV item sets: (1) four item sets (physical IPV, sexual IPV, psychological IPV, controlling behaviours) to assess the invariance of a four-dimensional IPV model, (2) three item sets (physical IPV, sexual IPV, controlling behaviours) to assess the invariance of a three-dimensional IPV model, and (3) two item sets (physical IPV and either sexual IPV or controlling behaviours) to assess the invariance of a bidimensional IPV model. We encountered challenges completing all analytical steps for these models (Supplemental File S1), which we discuss in the Limitations section of the Discussion with recommendations for future research. To address these challenges, we applied the above analytical steps to assess the invariance of unidimensional IPV models for item sets that arguably were more behaviourally based and/or had greater content validity because they included more items [34]. The analyses presented in the body of this paper therefore assessed separately the measurement invariance of the seven physical-IPV items and the five controlling-behaviour items.


Conventional prevalence estimates of IPV

Estimates for lifetime IPV were generally high but ranged widely across sample countries (Table 2). Reported lifetime experience of physical IPV ranged from 5.6% in Comoros to 50.5% in Afghanistan. Reported lifetime experience of sexual IPV ranged from 1.1% in Armenia to 25.5% in the DRC. Reported lifetime experience of psychological IPV ranged from 6.4% in Comoros to 50.8% in Afghanistan, and reported experiences of controlling behaviours ranged from 25.9% in Cambodia to 84.7% in Gabon.

Table 2 National (weighted) estimates for lifetime and prior-year intimate partner violence, 36 Demographic and Health Surveys across 36 countries (2012–2018)

Reported prior-year prevalences of IPV, by type, also are presented in Table 2. In general, the lower item-specific prevalences for prior-year IPV, by type, made invariance testing with these measures more difficult (Supplemental File S1; results available on request).

Results from country-specific exploratory and confirmatory factor analyses

Tables 3 and 4 present the results for country-specific EFAs and CFAs for lifetime physical IPV (Table 3) and for controlling behaviours (Table 4) for all 36 DHS samples. For lifetime physical IPV, across all countries, all loadings exceeded 0.55 in the country-specific EFAs and exceeded 0.65 in the country-specific CFAs, above the 0.35 recommended benchmark. Moreover, all model-fit statistics (RMSEA, CFI, TLI) were within recommended benchmarks (Table 3). For controlling behaviours, across all countries, all loadings exceeded 0.50 in the country-specific EFAs and exceeded 0.40 in the country-specific CFAs, above the 0.35 recommended benchmark. Moreover, in almost all cases, model-fit statistics (RMSEA, CFI, TLI) were within recommended benchmarks (Table 4). Thus, in country-specific EFAs and CFAs, unidimensional models for the seven physical-IPV items and the five controlling-behaviour items had reasonable fits with the data across all countries. The country-specific loadings for each item and the ranges of estimated item loadings across countries are reported in Supplemental Tables S3a and S3b.

Table 3 Results of country-specific factor analyses and alignment optimization cross-country measurement invariance analysis, seven lifetime physical intimate partner violence items, N = 36 Demographic and Health Surveys across 36 countries (2012–2018)
Table 4 Results of country-specific factor analyses and alignment optimization cross-country measurement invariance analysis, five controlling behaviour items, N = 36 Demographic and Health Surveys across 36 countries (2012–2018)

Multiple-group CFA results: assessment of exact measurement invariance

Table 5 presents results for the MGCFAs for physical IPV (Panel 1) and controlling behaviours (Panel 2), across all 36 included countries. For the physical-IPV unidimensional model, the metric and configural models differed significantly (at p < 0.001), as did the scalar and metric models (at p < 0.001). Based on the test statistics and their proposed benchmarks, metric invariance across countries was not achieved. Similarly, for the controlling-behaviour unidimensional model, the metric and configural models differed significantly (at p < 0.001), as did the scalar and metric models (at p < 0.001). Based on the test statistics and their proposed benchmarks, metric invariance across countries was not achieved.

Table 5 Multiple-group confirmatory factor analysis, N = 136,693 across Demographic and Health Surveys in 36 countries, 2012–2018

Alignment optimization results: assessment of approximate measurement invariance

Given the lack of exact measurement invariance based on the MGCFA results, Table 6 presents the results based on alignment optimization, in which we assessed approximate measurement invariance separately for the physical-IPV items (Panel 1) and the controlling-behaviour items (Panel 2). For physical IPV, 55 (or 21.8% of) estimated thresholds, seven (or 2.8% of) estimated loadings, and 12.3% of all parameter estimates were measurement non-invariant (Table 3). The items ‘slap’, ‘choke’, and ‘twist’ had a low degree of threshold invariance, and the item ‘choke’ had a low degree of loading invariance (see low R² values in Table 6, Panel 1). For controlling behaviours, 21 (or 11.7% of) estimated thresholds, three (or 1.7% of) estimated loadings, and 6.7% of all parameter estimates were measurement non-invariant (Table 4). All items had a reasonable degree of threshold invariance; however, the items ‘meet your female friends’ and ‘contact with your family’ had a low degree of loading invariance (see low R² values in Table 6, Panel 2). Again, a guideline of 25% or fewer total non-invariant parameter estimates is recommended for trustworthy latent mean estimates and their comparison across groups. Overall, results suggested that the DHS item sets for physical IPV and controlling behaviours exhibited approximate measurement invariance across the 36 countries and allowed acceptable alignment performance.
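The reported percentages can be reproduced with simple arithmetic, assuming that each item contributes one threshold and one loading per country (7 × 36 = 252 of each for physical IPV; 5 × 36 = 180 of each for controlling behaviours). These denominators are an inference from the text rather than something it states, and the function name is hypothetical. Under this assumption, the reported 2.8% of loadings and 12.3% of all parameters for physical IPV correspond to seven non-invariant loadings.

```python
def noninvariance_summary(n_items, n_groups, bad_thresholds, bad_loadings, limit=25.0):
    """Percentages of non-invariant parameters, rounded to one decimal place."""
    per_type = n_items * n_groups  # assumed: one threshold and one loading per item per group
    pct_thresholds = 100 * bad_thresholds / per_type
    pct_loadings = 100 * bad_loadings / per_type
    pct_total = 100 * (bad_thresholds + bad_loadings) / (2 * per_type)
    return {
        "thresholds": round(pct_thresholds, 1),
        "loadings": round(pct_loadings, 1),
        "total": round(pct_total, 1),
        "within_limit": pct_total <= limit,  # 25% guideline for trustworthy alignment
    }


physical = noninvariance_summary(n_items=7, n_groups=36, bad_thresholds=55, bad_loadings=7)
controlling = noninvariance_summary(n_items=5, n_groups=36, bad_thresholds=21, bad_loadings=3)
print(physical)
print(controlling)
```

Both totals fall well below the 25% guideline, which is the basis for treating the alignment results as trustworthy.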

Table 6 Results from alignment optimization analysis, N = 136,693 across Demographic and Health Surveys in 36 countries, 2012–2018

Country rankings on level of physical IPV based on AO-estimates and standard prevalence

For illustration, Fig. 1 compares country rankings on level of lifetime physical IPV based on AO-derived scores versus conventional prevalence estimates. (Full country-ranking results for physical IPV and analogous results for controlling behaviours are available on request.) The physical IPV scores are factor means derived from the final AO factor model, which presumes that observed items reflect a latent physical IPV construct. The prevalence estimates are based on aggregates of the observed responses to physical IPV items using mean estimation with adjustment for sampling. Uncertainties in both sets of estimates are reflected in 99.9% confidence intervals to account for multiple comparisons. As shown in Fig. 1, the distributions of country rankings based on AO-derived scores and prevalence estimates suggested some country-level differences; however, a Wilcoxon matched-pairs signed-rank test supported no significant difference in country rankings. Both sets of estimates exhibited a high degree of clustering. For example, in comparing countries using AO-derived scores, 12 clusters emerged, wherein country estimates did not differ significantly from one another. In comparing countries by conventional estimates of prevalence and associated confidence limits, three major clusters emerged: countries ranked 1–12, those ranked 13–30, and those ranked 31–36.
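The way wide intervals produce clusters of indistinguishable countries can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the paper: it uses a normal approximation for the 99.9% interval, treats non-overlapping intervals as the criterion for a difference (a conservative rule), and the country labels, estimates, and standard errors are made up.

```python
from statistics import NormalDist

# Two-sided 99.9% critical value from the standard normal (about 3.29).
Z_999 = NormalDist().inv_cdf(1 - 0.001 / 2)


def interval(estimate, se, z=Z_999):
    """Normal-approximation confidence interval for one estimate."""
    return estimate - z * se, estimate + z * se


def overlaps(a, b):
    """True if two (lower, upper) intervals share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


def rank_with_overlap(estimates):
    """Rank countries from highest to lowest estimate.

    estimates: dict mapping name -> (estimate, standard_error).
    Returns rows of (rank, name, interval, overlaps_next_ranked).
    """
    ranked = sorted(estimates.items(), key=lambda kv: kv[1][0], reverse=True)
    rows = []
    for i, (name, (est, se)) in enumerate(ranked):
        ci = interval(est, se)
        nxt = ranked[i + 1] if i + 1 < len(ranked) else None
        tied = nxt is not None and overlaps(ci, interval(*nxt[1]))
        rows.append((i + 1, name, ci, tied))
    return rows


# Hypothetical prevalence estimates and standard errors (illustrative only).
toy = {"Country A": (0.50, 0.01), "Country B": (0.48, 0.01), "Country C": (0.20, 0.01)}
for row in rank_with_overlap(toy):
    print(row)
```

In the toy data, the first two countries cannot be separated and so fall into one cluster, while the third is clearly lower; a rank difference within a cluster carries no information.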

Fig. 1 Levels of physical IPV derived from the alignment optimization approach and conventional prevalence estimation and associated country rankings, N = 36 Demographic and Health Surveys for 36 countries from 2012 to 2018

Convergent validity of AO-derived scores for physical IPV and controlling behaviour with IPV prevalences

As expected, AO-derived scores for physical IPV and for controlling behaviours were positively correlated with prevalence estimates for all four types of IPV, providing evidence for convergent validity. Pairwise correlations for physical IPV ranged from 0.48 to 0.66, and those for controlling behaviour ranged from 0.49 to 0.87. Pairwise scatter plots provided empirical support for linear relationships (Supplemental Fig. S1).


Summary of findings

This is the first cross-national analysis to assess the measurement invariance of seven standard physical-IPV items and five standard controlling-behaviour items from the DVM administered in 36 DHS across 36 LMICs during 2012–2018. Included countries spanned five world regions and had populations that varied in size, schooling attainment for women, income inequality, and degree of gender equity in the national legal environment. Elements of survey administration related to team size, number of training days, and average interview duration also differed across countries.

In separate (unidimensional) analyses, both item sets exhibited good country-specific measurement properties for all 36 LMICs. Although neither item set met the criteria for metric or scalar invariance, both item sets did meet the criteria for approximate invariance across all 36 LMICs. The distributions of country rankings, based on AO-derived scores and conventional prevalence estimates, were similar for physical IPV and for controlling behaviours. However, both AO-derived scores and prevalence estimates often were highly clustered and not significantly different, suggesting that individual country rankings were not interpretable using either set of estimates for either type of IPV.

Limitations and strengths

Findings are limited to the seven physical-IPV items and five controlling-behaviour items included in this analysis. As such, findings cannot be extrapolated to different physical IPV items, different controlling-behaviour items, or other item sets intended to capture other types of IPV. This limitation is important, given the challenges we encountered when attempting to undertake the same analytical steps for other combinations of IPV item sets (Supplemental File S1). These analytical challenges may be attributable to a variety of issues. First, the conceptualizations of psychological IPV [34,35,36] and sexual IPV [34] remain under-developed, especially in research undertaken in LMIC settings. Second, the sexual IPV and psychological IPV item sets used in the DHS each included only three items capturing a narrow subset of behaviours. The sexual IPV items, for example, captured only “forced” sex acts and excluded acts that occur when the victim is unable to consent [34]. Small item sets that lack content validity may miss acts that contribute importantly to the latent construct across countries. Third, low item-specific prevalences (especially for sexual IPV) have been noted as a concern for efforts to validate measures of IPV [37]. In our case, low item prevalences prevented model convergence during the Monte Carlo simulation stage of some AO analyses. Underestimates of IPV present ongoing challenges to the accurate measurement of IPV in LMICs [38]. For the DHS DVM, some of this low prevalence may have arisen because the DVM is implemented at the end of a sometimes long, multi-purpose survey (Table 1), when respondents and interviewers may be fatigued. Fourth, the less behaviourally-based and more subjective nature of the sexual IPV items (e.g., “physically forced”) and psychological IPV items (e.g., “humiliated”) may be a source of non-invariance, as such wording may be interpreted differently across languages and contexts. Finally, some differences in survey administration across countries in this analysis (e.g., team size; training duration; interview duration) may have contributed to our inability to establish exact invariance for items in the analysis. The DVM typically is administered at or toward the end of the women’s interview; therefore, the inclusion of more, sensitive, or different modules earlier in the interview may have framed the DVM in ways that affected its measurement invariance across countries.

Findings also are limited to this non-representative set of LMICs and for the period of analysis (2012–2018). Nevertheless, the establishment of approximate measurement invariance for seven physical-IPV items and five controlling-behaviour items across highly diverse LMICs spanning five world regions suggests the utility of these item sets to compare countries on these dimensions of IPV. These results support their use to monitor SDG5.2.1.

Implications for research and policy

Findings from this analysis have several major implications for future research and policy. First, we recommend that this analysis be replicated for high-income countries (HICs), LMICs in regions that are under-represented here, and surveys conducted before or after 2012–2018. Second, many of the estimates for physical IPV and controlling behaviours, whether derived from alignment optimization or based on standard prevalences, were not statistically different when using a more conservative p-value (< 0.001) that took multiple comparisons into account. If national-level comparisons of estimates for physical IPV and controlling behaviours are a priority for monitoring SDG5.2.1, we recommend that such comparisons be based on estimates derived from larger national samples, which help to reduce sampling error and to increase statistical power. By extension, any cross-time comparison of national IPV estimates between independent, repeated cross-sectional surveys may require larger sample sizes. Alternatively, if resource constraints do not allow sample surveys to be designed to reduce sampling error, we recommend that an international body like the World Health Organization consider convening an expert panel to deliberate the utility and policy relevance of establishing ranges for physical IPV and controlling behaviour that permit grouped comparisons.

Third, the physical IPV items ‘slap’, ‘twist’, and ‘choke’ exhibited a low degree of threshold and/or loading invariance across countries. Cognitive testing of these items is recommended to improve their cross-national measurement equivalence. Likewise, the controlling-behaviour items ‘…meet your female friends’ and ‘…contact with your family’ exhibited a low degree of loading invariance across countries, and cognitive testing of these items also is recommended to improve their cross-national psychometric performance.

Fourth, further testing of these item sets for measurement invariance across repeated national surveys is needed to assess how invariant these item sets are over extended periods of time. Fifth, this analysis should be replicated for expanded psychological-IPV and sexual-IPV item sets, both currently only three items each. Until then, the seven physical-IPV items and the five controlling-behaviour items from the DHS DVM appear useful to measure and to compare countries on levels of IPV against women.


Alignment Optimization is a powerful approach to assess approximate measurement equivalence of IPV scales across widely diverse countries charged with monitoring SDG5.2. The seven physical-IPV items and the five controlling-behaviour items from the DHS DVM exhibit approximate measurement invariance across 36 diverse LMICs spanning five regions. If shown to be invariant over calendar time and across HICs, these item sets may be useful to monitor SDG5.2 globally.

Availability of data and materials

The Demographic and Health Surveys (DHS) data used here are available upon reasonable request from Measure DHS, to which investigators must apply for access.



Abbreviations

IPV: Intimate Partner Violence
CI: Confidence Interval
GDP: Gross Domestic Product
SDG: Sustainable Development Goal
WHO: World Health Organization
DHS: Demographic and Health Surveys
DVM: Domestic Violence Module
LMIC: Lower- and Middle-Income Country
VAWG: Violence Against Women and Girls
USD: United States Dollar
EFA: Exploratory Factor Analysis
RMSEA: Root Mean Square Error of Approximation
CFI: Comparative Fit Index
TLI: Tucker-Lewis Index
CFA: Confirmatory Factor Analysis
MGCFA: Multiple-Group Confirmatory Factor Analysis
AO: Alignment Optimization
DRC: Democratic Republic of the Congo
HIC: High-Income Country


  1. World Health Organization on behalf of the United Nations Inter-Agency Working Group on Violence Against Women Estimation and Data (UNICEF, U., UNODC, UNSD, UNWomen), Violence against women prevalence estimates 2018. Global, regional and national prevalence estimates for intimate partner violence against women and global and regional prevalence estimates for non-partner sexual violence against women. Geneva: World Health Organization; 2021.

    Google Scholar 

  2. Devries KM, et al. Intimate partner violence and incident depressive symptoms and suicide attempts: a systematic review of longitudinal studies. PLoS Med. 2013;10(5):e1001439.

    Article  Google Scholar 

  3. Devries KM, et al. Intimate partner violence victimization and alcohol consumption in women: a systematic review and meta-analysis. Addiction. 2014;109(3):379–91.

    Article  Google Scholar 

  4. Dillon G, et al. Mental and physical health and intimate partner violence against women: a review of the literature. Int J Family Med. 2013;2013:313909.

    Article  Google Scholar 

  5. Crane CA, Hawes SW, Weinberger AH. Intimate partner violence victimization and cigarette smoking: a meta-analytic review. Trauma Violence Abuse. 2013;14(4):305–15.

    Article  Google Scholar 

  6. Beydoun HA, et al. Intimate partner violence against adult women and its association with major depressive disorder, depressive symptoms and postpartum depression: a systematic review and meta-analysis. Soc Sci Med. 2012;75(6):959–75.

    Article  Google Scholar 

  7. Maxwell L, et al. Estimating the effect of intimate partner violence on women's use of contraception: a systematic review and meta-analysis. PLoS One. 2015;10(2):e0118234.

    Article  Google Scholar 

  8. Yount KM. Resources, family organization, and domestic violence against married women in Minya. Egypt J Marriage Fam. 2005;67(3):579–96.

    Article  Google Scholar 

  9. Potter LC, et al. Categories and health impacts of intimate partner violence in the World Health Organization multi-country study on women’s health and domestic violence. Int J Epidemiol. 2020;50(2):652–62.

    Article  Google Scholar 

  10. Hoeffler A, Fearon J. Benefits and costs of the conflict and violence targets for the post-2015 development agenda, in Post-2015 consensus, conflict and violence assessment paper. Copenhagen: Copenhagen Consensus Center; 2014.

    Google Scholar 

  11. United Nations Department of Economic and Social Affairs Statistics Division. Guidelines for producing statistics on violence against women-statistical surveys. New York: United Nations; 2014.

    Google Scholar 

  12. United Nations. Transforming our world: the 2030 agenda for sustainable development. New York: United Nations; 2015.

    Google Scholar 

  13. Devries KM, et al. Global health. The global prevalence of intimate partner violence against women. Science. 2013;340(6140):1527–8.

    CAS  Article  Google Scholar 

  14. Garcia-Moreno C, et al. Prevalence of intimate partner violence: findings from the WHO multi-country study on women's health and domestic violence. Lancet. 2006;368(9543):1260–9.

    Article  Google Scholar 

  15. Kishor S, Johnson K. Profiling domestic violence: a multi-country study. Calverton: ORC Macro; 2004.

    Google Scholar 

  16. Breiding MJ, et al. Intimate partner violence surveillance: uniform definitions and recommended data elements, version 2.0. Atlanta: National Center for Injury Prevention and Control, Centers for Disease Control and Prevention; 2015.

    Google Scholar 

  17. World Health Organization. WHO multi-country study on women’s health and domestic violence against women: summary report of initial results onprevalence, health outcomes and women’s responses. Geneva: World Health Organization; 2005.

    Google Scholar 

  18. MEASURE DHS and ICF International. Domestic violence module: demographic and health surveys methodology. Calverton: MEASURE DHS/ICF International; 2014.

    Google Scholar 

  19. Short Fabic M, Choi Y, Bird S. A systematic review of demographic and health surveys: data availability and utilization for research. Bull World Health Organ. 2012;90(8):604–12.

    Article  Google Scholar 

  20. Hancioglu A, Arnold F. Measuring coverage in MNCH: tracking progress in health for women and children using DHS and MICS household surveys. PLoS Med. 2013;10(5):e1001391.

    Article  Google Scholar 

  21. Yount KM, et al. Response effects to attitudinal questions about domestic violence against women: a comparative perspective. Soc Sci Res. 2011;40:873–84.

    Article  Google Scholar 

  22. Guenole N, Brown A. The consequences of ignoring measurement invariance for path coefficients in structural equation models. Front Psychol. 2014;5:980.

    Article  Google Scholar 

  23. Costa D, Barros H. Instruments to assess intimate partner violence: a scoping review of the literature. Violence Vict. 2016;31(4):591–621.

    Article  Google Scholar 

  24. Ellsberg M, Heise L. Bearing witness: ethics in domestic violence research. Lancet. 2002;359(9317):1599–604.

    Article  Google Scholar 

  25. StataCorp. Stata statisical software: release 16. College Station: StatCorp LP; 2019.

    Google Scholar 

  26. Muthén LK, Muthén BO. Mplus user's guide. 8th ed. null. 1998-2017. Los Angeles: Muthén & Muthén.

  27. Hu L, Bentler PM. Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct Equ Model Multidiscip J. 1999;6(1):1–55.

    Article  Google Scholar 

  28. Brown TA. Confirmatory factor analysis for applied research. London: The Guilford Press; 2006.

    Google Scholar 

  29. Vandenberg RJ, Lance CE. A review and synthesis of the measurement invariance literature: suggestions, practices, and recommendations for organizational research. Organ Res. 2000;3(1):4–70.

  30. Muthén B, Asparouhov T. IRT studies of many groups: the alignment method. Front Psychol. 2014;5:1–7.

    Google Scholar 

  31. Davidov E, et al. The comparability of measurements of attitudes toward immigration in the European social survey: exact versus approximate measurement equivalence. Public Opin Q. 2015;79(S1):244–66.

  32. Muthén B, Asparouhov T. Recent methods for the study of measurement invariance with many groups: alignment and random effects. Sociol Methods Res. 2018;47:637–64.

    Article  Google Scholar 

  33. Asparouhov T, Muthén B. Multiple-group factor analysis alignment. Struct Equ Model Multidiscip J. 2014;21(4):495–508.

    Article  Google Scholar 

  34. Follingstad DR, Rogers MJJSR. Validity concerns in the measurement of women’s and men’s report of intimate partner violence. Sex Roles. 2013;69(3–4):149–67.

  35. Martín-Fernández M, Gracia E, Lila MJBPH. Psychological intimate partner violence against women in the European Union: a cross-national invariance study. BMC Public Health. 2019;19(1):1–11.

  36. Porrúa-García C, et al. Development and validation of the scale of psychological abuse in intimate partner violence (EAPA-P). Psicothema. 2016;28(2):214–21.

  37. Ryan KMJSr. Issues of reliability in measuring intimate partner violence during courtship. Sex Roles. 2013;69(3–4):131–48.

  38. Palermo T, Bleck J, Peterman AJAjoe. Tip of the iceberg: reporting and gender-based violence in developing countries. Am J Epidemiol. 2014;179(5):602–12.

Download references


The authors thank our advisory board (Kristin Dunkle, Claudia Garcia-Moreno, Andrew Gibbs, Sunita Kishor, Rachel Jewkes, and Enrique Gracia) for comments on the analysis.


This work was supported by the Eunice Kennedy Shriver National Institute of Child Health and Human Development and the National Institute of Mental Health [R01HD099224] (PIs CJC, KMY). The funders played no role in the design of the study; in the collection, analysis, and interpretation of data; in writing the manuscript; or in the decision to submit for publication.

Author information



KMY, CJC, and YFC conceptualized the study and developed the methodology. KMY, YFC, and ZK analyzed and visualized the data. KMY and CJC acquired funding. KMY, CJC, ZK, and IB undertook project administration. KMY, CJC, YFC, and NK provided supervision. KMY, CJC, IB, YFC, and ZK wrote the original draft. All authors reviewed and edited the manuscript, and all authors read and approved the final manuscript.

Corresponding author

Correspondence to Kathryn M. Yount.

Ethics declarations

Ethics approval and consent to participate

This study was deemed exempt (not human subjects research) by the Emory University Institutional Review Board. All methods used to collect the original data were performed according to the Declaration of Helsinki.

Consent for publication

Not applicable.

Competing interests

The authors have no competing interests to declare.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Monitoring SDG 5 Supplementary Materials.docx contains supplemental tables and figures referenced in the text.

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Yount, K.M., Cheong, Y.F., Khan, Z. et al. Global measurement of intimate partner violence to monitor Sustainable Development Goal 5. BMC Public Health 22, 465 (2022).


Keywords

  • Alignment optimization
  • Controlling behaviours
  • Cross-national
  • Measurement invariance testing
  • Physical intimate partner violence
  • Sustainable development goal 5