Treating loss-to-follow-up as a missing data problem: a case study using a longitudinal cohort of HIV-infected patients in Haiti

Background: HIV programs are often assessed by the proportion of patients who are alive and retained in care; however, some patients are categorized as lost to follow-up (LTF) and have unknown vital status. LTF is not an outcome but a mixed category of patients who have undocumented death, transfer and disengagement from care. Estimating vital status (dead versus alive) among this category is critical for survival analyses and program evaluation.

Methods: We used three methods to estimate survival in the cohort and to ascertain factors associated with death among the first cohort of HIV positive patients to receive antiretroviral therapy in Haiti: complete case (CC) (drops missing), Inverse Probability Weights (IPW) (uses tracking data) and Multiple Imputation with Chained Equations (MICE) (imputes missing data). Logistic regression was used to calculate odds ratios and 95% confidence intervals for adjusted models for death at 10 years. The logistic regression models controlled for sex, age, severe poverty (living on <$1 USD per day), Port-au-Prince residence and baseline clinical characteristics of weight, CD4, WHO stage and tuberculosis diagnosis.

Results: Age, severe poverty, baseline weight and WHO stage were statistically significant predictors of AIDS related mortality across all models. Gender was only statistically significant in the MICE model but had at least a 10% difference in odds ratios across all models.

Conclusion: Each of these methods had different assumptions and differed in the number of observations included due to how missing values were addressed. We found MICE to be most robust in predicting survival status as it allowed us to impute missing data so that we had the maximum number of observations to perform regression analyses. MICE also provides a complementary alternative for estimating survival among patients with unassigned vital status. Additionally, the results were easier to interpret, less likely to be biased and provided an alternative to a problem that is often commented upon in the extant literature.

Electronic supplementary material: The online version of this article (10.1186/s12889-018-6115-0) contains supplementary material, which is available to authorized users.


Background
HIV programs are often assessed by the proportion of patients who are alive and retained in care, which has direct consequences for funding and programmatic services offered [1,2]. However, among individuals who initiate antiretroviral treatment (ART), the reported rate of loss to follow-up ranges from 5 to 53% [1,3-10]. Clinically, these LTF patients are at risk for adverse outcomes such as medication resistance, transmission to others, lack of care, or at best, incomplete medical records when they transfer care to another clinic [1,6,7]. Programmatically, loss to follow-up leads to underestimates of retention, which could be misinterpreted as underperformance on program outcomes [1,5,6,11].
The category of lost to follow-up (LTF) is not a homogeneous outcome (e.g., "dead" or "alive") but rather a heterogeneous category of three disparate health states: undocumented deaths, undocumented or silent transfers to another source of HIV care, or alive and complete disengagement from HIV care [12-14]. The proportion of patients who are alive and retained in care is, by definition, the proportion who are neither dead nor LTF. The fact that LTF is part of the definition makes this outcome complex and problematic.
In reality, LTF is a marker for missing data on vital status. We argue that LTF should not be treated as a legitimate outcome category because its meaning can easily change over time and across sites. For example, patients who silently transfer to another provider, move domiciles or die outside of a healthcare facility could all be classified as LTF. Thus, studying predictors of LTF should be avoided. Instead, LTF should be considered a missing data problem that needs to be solved. We present a unique application of MICE to impute both missing outcome (vital status) and missing covariates, simultaneously, using a large longitudinal cohort of patients from Haiti who were treated for HIV infection, and compare the results with MICE to the more traditional analytic methods of using complete cases and inverse probability weights. We also evaluated associations that were predictive of death using three different methods: complete case, inverse probability weights and multiple imputation with chained equations.

Statistical methods for handling missing vital status
In the HIV literature, for studies assessing predictors of mortality/survival, the most common methods of dealing with LTF are complete case analysis, survival models that censor those LTF, and tracing with inverse probability weights [10,15-22]. Other methods exist, including simple imputation, multiple imputation, and Bayesian analysis [15]. Each method has different underlying assumptions about the missing data.

Complete case analysis
Complete case analysis omits observations with missing data in multivariable analyses. It is the default method in most statistical software programs and is applied automatically. As only complete observations are used, sample size is decreased, statistical power is compromised, and study results are often biased [10,16].

Kaplan Meier survival analysis
Kaplan Meier analysis assumes that lost to follow up is unrelated to mortality. To state this another way, patients who are censored due to LTF have the same probability of survival as those who are not lost to follow up [23]. However, one cannot verify the Kaplan Meier assumption without more information. From the extant literature, studies have traced patients who are categorized as lost and found that between 12 and 87% were dead [24]. With this wide range in mind, it is impossible to say if LTF is associated with higher mortality, lower mortality, or if there is no association. Employing this method, patients who are LTF are censored at a time point typically defined by the date when vital status was last verified. It is often used for analyzing HIV cohort data because all cases can be included, at least for the duration that they were followed before being lost.
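The censoring logic described above can be made concrete with a minimal sketch of the Kaplan Meier (product-limit) estimator. This is an illustration only, not the analysis code used in this study; the function name and the toy follow-up times are hypothetical. Patients who are LTF contribute to the number at risk until their censoring time and are then removed without counting as deaths.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the patient died at that time, 0 if censored (e.g., LTF)
    Returns a list of (time, estimated survival) at each distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        n_at_risk -= leaving
    return curve


# Hypothetical toy data: five patients, two of them censored (LTF).
curve = kaplan_meier(times=[1, 2, 2, 3, 4], events=[1, 0, 1, 0, 1])
for t, s in curve:
    print(f"time {t}: survival {s:.2f}")
```

In this toy example the censored patients never lower the curve directly; they only shrink the denominator for later death times, which is exactly why the estimate is valid only if censoring is unrelated to mortality.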

Inverse probability weights from tracing
Inverse probability weights (IPW) offer another general method for dealing with missing data [17-21,25]. In the HIV literature, they are often used in conjunction with tracing data. This approach involves using physical or contact tracing to determine the true vital status among a sample of those LTF [20-22,25]. Then, assuming this sample is representative of all LTF, tracing data is used to apply weights to the subjects with no missing outcome data, so that the weighted analysis provides less biased results, compared to the biased results when using (unweighted) complete cases. The results of the tracing are used to calculate the inverse probability of being a complete case (given the unique set of patient characteristics, including predictors and outcomes), which is used to weight each of the complete cases [20-22,25]. This method assumes that those who are unsuccessfully traced have a mortality that can be accurately estimated from those successfully traced.
For example, consider a simple analysis to assess whether gender predicts mortality. Among 100 women, 50 are documented dead and 50 are documented alive; among 100 men, 20 are documented dead, 20 are documented alive, and 60 are LTF. A "complete case" analysis suggests that men and women have the same risk of dying (RR = 1), since 50% of the men with known status died and 50% of the women died. However, suppose all 60 of the men LTF were successfully traced and found to be dead. For women who died, all were complete cases, so the IPW is the inverse of the probability of being a complete case, or 1/1.0, or 1. For all women who did not die, all were also complete cases, so the IPW is also 1/1.0. For men who were alive, all were complete cases, so their IPW is also 1/1.0. But for men who died (n = 80, 20 complete case deaths and 60 traced deaths), the probability of being a complete case was 20/(20 + 60) = 0.25, and therefore the IPW is 1/0.25, or 4. If we apply these weights and do an IPW analysis (giving complete case men who died 4x the weight of any other complete case), then the average mortality among men is 20 × 4/(20 + 20 × 4) = 80%, and the risk of dying among men compared to women is 80/50 = 1.6.
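The arithmetic of this worked example can be reproduced with a short sketch. The counts come directly from the example above; the function and variable names are hypothetical and chosen only for illustration.

```python
def weighted_mortality(n_dead, n_alive, w_dead=1.0, w_alive=1.0):
    """Weighted proportion dead among complete cases."""
    dead = n_dead * w_dead
    alive = n_alive * w_alive
    return dead / (dead + alive)


# Complete case analysis: the 60 men who are LTF are simply dropped.
women_mortality = weighted_mortality(n_dead=50, n_alive=50)
men_mortality_cc = weighted_mortality(n_dead=20, n_alive=20)
rr_complete_case = men_mortality_cc / women_mortality

# IPW analysis: all 60 LTF men were traced and found dead, so a man who died
# had a 20 / (20 + 60) chance of being a complete case.
p_complete_given_dead = 20 / (20 + 60)
ipw_dead_men = 1 / p_complete_given_dead
men_mortality_ipw = weighted_mortality(n_dead=20, n_alive=20, w_dead=ipw_dead_men)
rr_ipw = men_mortality_ipw / women_mortality

print(f"Complete case RR: {rr_complete_case:.1f}")
print(f"IPW for men who died: {ipw_dead_men:.0f}")
print(f"IPW mortality among men: {men_mortality_ipw:.0%}")
print(f"IPW RR: {rr_ipw:.1f}")
```

The weight of 4 restores the 80 deaths among men that the complete case analysis reduced to 20, which moves the risk ratio from 1.0 to 1.6.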
Note: If only a fraction (f) of the LTF get traced, then each of the traced cases is weighted by the inverse probability of being traced, that is, by 1/f. However, the performance of the IPW model is dependent on the methods used to track patients. In resource-limited settings, tracing is difficult, costly, and often unsuccessful. In our case study, Haiti does not have a unique national identification number for its citizens, making it difficult to track patients across various health systems or to verify vital status by referencing a current national death registry [3].

Multiple imputation with chained equations (MICE)
Multiple Imputation with Chained Equations (MICE) is a less commonly used method for estimating the vital status of those LTF. Although MICE is commonly used to impute missing covariate (predictor) data [10,26,27], it can also be used to impute missing outcome data [26,27]. MICE is optimal when less than 30% of a variable's data are missing and when subjects with missing data are only randomly different ("missing at random") from those subjects who share an identical set of patient characteristics, or covariate values [28-31]. However, to our knowledge, no articles in the extant HIV literature have reported results after imputing both the outcome and covariates simultaneously.
The aim of this analysis is to present the application of MICE to impute both missing outcome (vital status) and missing covariates, simultaneously, using a large longitudinal cohort of patients from Haiti who were treated for HIV infection, and compare the results with MICE to the more traditional methods of using complete cases, survival analysis and inverse probability weights. Specifically, we compare adjusted logistic regression models for factors associated with death using complete case, IPW and MICE.

Study population
The study population is a cohort of 910 individuals age 13 years or older who initiated antiretroviral therapy (ART) for HIV according to international guidelines between March 2003 and April 2004 in Haiti [32,33]. The cohort was followed for ten years through 2015. Details of this cohort are described in previous publications [32,33].

Clinical measurements and outcomes
Clinical characteristics available from routinely documented data included body weight, CD4+ cell count (CD4), WHO stage, and diagnosis of tuberculosis. Sociodemographic data included age, sex, severe poverty, and residence within the city of Port au Prince. Severe poverty was defined as living on less than one United States dollar per day. Date of death and transfer were documented in the medical record. Lost to follow-up was defined as no documented death or transfer and no clinical visit or pharmacy pick-up during the last 180 days of the 10-year follow-up. Patients who were classified as LTF were traced by clinic staff at the time of their 10-year anniversary to ascertain vital status.

Missing data
Data were missing for 3% of baseline weight measurements, 12% of baseline CD4 counts, and 12% of vital status outcomes at 10 years of follow-up. The 71 subjects who were documented to have transferred their care to another clinic (8%) were assumed to be alive at 10 years.

Multiple imputation with chained equations (MICE)
Data were assumed to be missing at random; i.e., subjects with missing values were considered only randomly different from other subjects that share the same pattern of values for the non-missing variables. MICE was used to impute all missing values, whether for missing covariates, such as CD4 count and weight, or for missing vital status (LTF) at 10 years of follow-up. We used Stata's implementation of MICE, which allows the imputation of various types of variables (categorical, ordinal, or continuous) in chained equations using a semi-Bayesian approach [30]. In this study, CD4 and baseline weight were continuous variables and vital status was a dichotomous variable. Results from multivariable fractional polynomial models on complete case data indicated that CD4 is best represented as a cubic function and baseline weight is best represented as a squared function. These transformations were included in the multiple imputation model. Equations were created to impute missing values and were composed of all variables used in the fully adjusted models [30]. Predictive mean matching using 5 nearest neighbors was used to impute CD4 and baseline weight [34-37]. Twenty imputations were computed based on current guidelines in the literature [30]. Various diagnostic measures were performed to check the fitness of the generated datasets. Specifically, proportions were calculated to assess imputed values of categorical variables, and continuous variables were assessed using trace plots [38]. The Stata command midiagplots was used to assess the imputed datasets [38].
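To illustrate the idea behind predictive mean matching with 5 nearest neighbors, the following is a deliberately simplified sketch. It is not Stata's implementation and not the code used in this study: it uses a single predictor, fits the regression by ordinary least squares without drawing parameters from a posterior distribution, and produces only one imputed dataset. All names and data values are hypothetical.

```python
import random


def pmm_impute(x, y, k=5, seed=0):
    """Impute missing y values (None) by predictive mean matching on x.

    Simplified: one predictor, least-squares fit, no parameter uncertainty.
    """
    rng = random.Random(seed)
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    n = len(observed)
    mean_x = sum(xi for xi, _ in observed) / n
    mean_y = sum(yi for _, yi in observed) / n
    sxx = sum((xi - mean_x) ** 2 for xi, _ in observed)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in observed)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def predict(xi):
        return intercept + slope * xi

    # Each donor is (predicted value, actually observed value).
    donors = [(predict(xi), yi) for xi, yi in observed]

    completed = []
    for xi, yi in zip(x, y):
        if yi is not None:
            completed.append(yi)
            continue
        target = predict(xi)
        # The k donors whose predicted values are closest to the target.
        nearest = sorted(donors, key=lambda d: abs(d[0] - target))[:k]
        # Borrow the observed value of one randomly chosen donor.
        completed.append(rng.choice(nearest)[1])
    return completed


# Hypothetical toy data: baseline weight (kg) predicting CD4 count.
weight_kg = [45, 50, 52, 55, 58, 60, 63, 65, 70, 72]
cd4 = [80, 120, None, 150, 160, None, 210, 230, 260, None]
completed_cd4 = pmm_impute(weight_kg, cd4, k=5, seed=1)
print(completed_cd4)
```

Because every imputed value is borrowed from a real patient, the imputations always fall within the range of observed values, which is one reason the method suits skewed variables such as CD4 count.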

Classification and regression trees (CART)
Classification and regression trees were utilized to ascertain if any interaction should be incorporated into the multiple imputation [39]. Classification trees, in contrast to traditional statistical models, are especially useful for assessing for interactions when there are significant amounts of missing data [40]. After building the tree and pruning it using the R command cptable, no statistically significant interactions were found [41].

Statistical analysis

Kaplan Meier
Survival estimates were calculated using Kaplan Meier analyses and a Kaplan Meier curve was generated. Time from enrollment to death or end of study censor (ten years after enrollment with a maximum date of June 26, 2014) was calculated. Participants who were classified as LTF were censored at their last visit to the clinic.

Inverse probability weights from tracing
In September 2013, staff attempted to contact all 156 patients who were classified as LTF, using telephone and home visits. Results of this tracing method were used to create inverse probability weights (IPW) that were applied to cases with similar covariates and known vital status.

Multiple imputation with chained equations
The mi suite of commands from Stata was used to perform analyses using the multiply imputed datasets. Stata's mi suite of commands follows Rubin's rules for the combination of results across imputed datasets [42].
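For readers unfamiliar with Rubin's rules, the sketch below shows how a single coefficient (for example, a log odds ratio) and its variance are pooled across imputed datasets. It is an illustration under simple assumptions, not Stata's code: it omits the degrees-of-freedom adjustment, and the estimates and variances are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed datasets using Rubin's rules.

    estimates : the parameter estimate from each imputed dataset
    variances : the squared standard error from each imputed dataset
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical log odds ratios and variances from three imputed datasets.
pooled_log_or, total_var = pool_rubin(
    estimates=[0.50, 0.60, 0.70],
    variances=[0.04, 0.05, 0.06],
)
print(f"pooled log OR = {pooled_log_or:.3f}, SE = {total_var ** 0.5:.3f}")
```

The between-imputation term is what distinguishes multiple imputation from single imputation: it inflates the standard error to reflect uncertainty about the values that were imputed.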

Logistic regression
For each predictor (covariate), logistic regression models were created to calculate odds ratios and 95% confidence intervals for being dead after 10 years of follow-up (univariable models). Additionally, we created multivariable (fully adjusted) models that included all clinical and sociodemographic variables. Although age, weight, and CD4 count were measured as continuous variables, when reporting the results of the logistic regression models, we describe the effects of a 10-year age difference, 10-kg weight difference, and 100-cell difference in CD4 count.
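Reporting the effect of a 10-unit or 100-unit difference for a continuous predictor amounts to multiplying the per-unit coefficient by the number of units before exponentiating. A minimal sketch follows; the coefficient and standard error are hypothetical values chosen for illustration, not estimates from this study.

```python
import math


def rescale_odds_ratio(beta_per_unit, se_per_unit, units, z=1.96):
    """Odds ratio and 95% CI for a difference of `units` in a continuous predictor."""
    odds_ratio = math.exp(units * beta_per_unit)
    lower = math.exp(units * (beta_per_unit - z * se_per_unit))
    upper = math.exp(units * (beta_per_unit + z * se_per_unit))
    return odds_ratio, lower, upper


# Hypothetical: log odds of death rises by 0.02 per year of age (SE 0.005).
or_10yr, lo_10yr, hi_10yr = rescale_odds_ratio(0.02, 0.005, units=10)
print(f"OR per 10 years: {or_10yr:.2f} (95% CI: {lo_10yr:.2f}-{hi_10yr:.2f})")
```

Equivalently, the odds ratio for a 10-unit difference is the per-unit odds ratio raised to the 10th power.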

Sensitivity analysis: multiple imputation then deletion
As a sensitivity analysis, we performed multiple imputation of all missing data, followed by deletion of all cases of missing outcomes. In this method, both the outcome and covariates are imputed; after the datasets are created, observations where the outcome was imputed are deleted from each dataset before running the same univariable and multivariable models [43]. This method has been reported to lead to more efficient estimates and narrower confidence intervals [43].
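The deletion step of this sensitivity analysis can be sketched as a simple filter applied to every imputed dataset. The data layout, function name, and values below are hypothetical and serve only to illustrate the idea.

```python
def drop_imputed_outcomes(imputed_datasets, outcome_missing):
    """Keep only rows whose outcome was originally observed.

    imputed_datasets : list of completed datasets, each a list of rows
    outcome_missing  : one flag per row, True if the outcome had to be imputed
    """
    return [
        [row for row, missing in zip(dataset, outcome_missing) if not missing]
        for dataset in imputed_datasets
    ]


# Hypothetical toy data: two imputed datasets of four subjects.
# Each row is (CD4 count, dead at 10 years); subject 2 had a missing outcome.
outcome_missing = [False, True, False, False]
imputed_datasets = [
    [(120, 0), (95, 1), (210, 0), (60, 1)],
    [(120, 0), (95, 0), (210, 0), (60, 1)],
]
analysis_datasets = drop_imputed_outcomes(imputed_datasets, outcome_missing)
print([len(d) for d in analysis_datasets])
```

Rows with imputed covariates but observed outcomes are retained, so the imputation still contributes information to the covariates of the analyzed cases.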
All analyses were performed using Stata version 13 and R version 3.4.2 (see Additional file 1).

Ethics and consent to participate
The institutional review boards at GHESKIO and at Weill Medical College of Cornell University approved this analysis.

Outcome tracing
Among the 156 patients who were categorized as LTF, the clinical team was able to trace and find 45 (29%). Of the 45 patients successfully traced, 37 (82%) were found to be alive and 8 (18%) had died prior to 10 years of follow-up. Based on the 18% risk of death among those successfully traced, we assume that 18% of the 156 LTF (n = 28) were dead at 10 years and the remainder were alive. Since the probability of being known alive at 10 years among all patients who were actually alive (known alive plus number estimated to be alive among LTF by the tracing method) is 0.79, then the IPW for all those subjects who are known alive is 1/0.79.
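The tracing arithmetic reported above can be checked with a few lines. The counts and the probability of 0.79 are taken from the text; the variable names are hypothetical, and the number of patients known to be alive is not restated here, so the 0.79 is used as reported rather than recomputed.

```python
n_ltf = 156          # patients categorized as LTF
n_traced = 45        # successfully traced
n_traced_alive = 37
n_traced_dead = 8

traced_fraction = n_traced / n_ltf
p_dead_among_traced = n_traced_dead / n_traced

# Apply the mortality seen among traced patients to all LTF patients.
est_dead_ltf = round(p_dead_among_traced * n_ltf)
est_alive_ltf = n_ltf - est_dead_ltf

# Probability of being known alive among all patients actually alive,
# as reported in the text.
p_known_alive = 0.79
ipw_known_alive = 1 / p_known_alive

print(f"traced: {traced_fraction:.0%}, dead among traced: {p_dead_among_traced:.0%}")
print(f"estimated dead among LTF: {est_dead_ltf}, alive: {est_alive_ltf}")
print(f"IPW for patients known alive: {ipw_known_alive:.2f}")
```

Each patient known to be alive therefore counts as roughly 1.27 patients in the weighted analysis, standing in for LTF patients who are estimated to be alive.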

Missing data/ diagnostics of the multiple imputation
Convergence was achieved when MICE was performed. To assess the results of the multiple imputation, kernel density and trace plots were constructed. The kernel density plots for the imputed values of CD4 and weight are shown in Fig. 1 for the first 5 imputed datasets. The means and interquartile ranges for imputed CD4 and weight are similar to those of the observed non-missing observations (Table 1). Figure 2 displays the trace plots for the twenty imputed datasets. These plots show no discernible pattern, which is the result expected of a well-executed multiple imputation.

Predictors of death using complete case, IPW and MICE: A comparison
Because any subject with missing covariate data was dropped, the weighted sample when using inverse probability weights from tracing had 111 fewer observations (N = 799) than the MICE dataset, which included all observations (N = 910). The complete case model had the fewest observations (N = 735) because any case with any missing value was dropped from the analysis. Table 2 displays the logistic regression results for each individual predictor of death using three types of models: complete case (CC), inverse probability weighting (IPW) and multiple imputation with chained equations (MICE). Severe poverty was statistically significant across all models and the odds ratio had an approximate 20% difference between CC and IPW (CC OR = 1.78, IPW OR = 1.59, MICE OR = 1.74). WHO stage and baseline weight were statistically significant across all models and had similar odds ratios from the three methods (Table 2). CD4 had a similar point estimate across all 3 models (CC OR = 0.86, IPW OR = 0.86, MICE OR = 0.85). However, the point estimate was not statistically significant in the CC model. Age was slightly different across all three models (CC OR = 1.17, IPW OR = 1.26, MICE OR = 1.20). Similar to CD4, age was not statistically significant for CC. Baseline tuberculosis was statistically significant across all models and had a slight variation in the point estimates (CC OR = 1.97, IPW OR = 1.92, MICE OR = 1.98). Gender and residence were not statistically significant in any model.
In univariable (single-predictor) analysis, the beta coefficients have similar point estimates regardless of method; in multivariable models, however, the point estimates differ. Table 3 displays results from multivariable logistic regression models using the three methods. Severe poverty was statistically significant across all models and the odds ratio had an approximate 10% difference between CC and MICE (CC OR = 1.63, IPW OR = 1.64, MICE OR = 1.80). Similarly, WHO stage was statistically significant across all models and had an approximate 15% difference between the odds ratios from the CC model (OR = 1.50) compared to the MICE model (OR = 1.76). Age and baseline weight were statistically significant across all the models with a slight variation in the point estimates and 95% confidence intervals. Female gender was found to be protective against death across all three models; however, it was statistically significant only in the MICE model (OR 0.62; 95% CI: 0.44-0.87) and there was about a 10% difference between the IPW and the MICE models' odds ratios. Baseline tuberculosis infection was associated with higher odds of death across the three models; however, it was statistically significant only in the complete case model (OR 1.83; 95% CI: 1.05-3.20). Additionally, there was an approximate 20% difference between the odds ratios of the CC and the MICE models for baseline tuberculosis infection. Port au Prince residence and CD4 were not statistically significant in any of the three models.

Sensitivity analysis
Results from the sensitivity analysis were very similar to the results from the MICE models for univariable and multivariable models. For severe poverty and baseline weight, with the multivariable model only, the 95%

Discussion
Among the first cohort of HIV patients who initiated antiretroviral therapy in Haiti from 2003 to 2014, we aimed to find associations that were predictive of death using three different methods: complete case, inverse probability weights and multiple imputation with chained equations.
These three procedures have different assumptions and differed in the number of observations included in the adjusted model due to how missing values for covariates were addressed. Although the point estimates were similar across the three models, for statistically significant factors we found as much as a 20% difference in odds ratio values. For statistically significant factors, such as severe poverty and WHO stage, the odds ratios in the MICE models were farther away from the null compared to the CC and IPW models. Severe poverty was a statistically significant predictor of death in the MICE model (OR 1.80; 95% CI: 1.28-2.52). In a similar cohort from the same clinic in Haiti, income was associated with a higher odds of attrition (OR 1.65; 95% CI: 1.25-2.19) [45]. Additionally, these estimates are similar to those from an intensive contact tracing program performed in Malawi on HIV positive patients, which found about 70% of people who were initially categorized as LTF were alive and 30% were dead [13]. Worldwide, LTF rates for patients who have been on ART for at least one year range from 5 to 53% [1,3-10]. Patient characteristics associated with becoming LTF include being clinically ill, as measured by CD4 count or WHO symptom staging, low socioeconomic status, and concern for stigma, as well as structural factors such as transportation issues [3,7-10,45-47]. Several studies have reported high rates of re-engagement in care by patients who were previously labeled as LTF [3,4,7,8,11,45]. A study in South Africa found that up to 50% of patients who disengaged from care will re-engage within 3 years, including care received at a hospital or emergency department visit [7].
In the small number of contemporary studies that were able to determine the true status of LTF patients, most patients had transferred care to clinics closer to their home or to newer clinics that provide different services, or alternatively were alive and not engaged in care [3,4,7,11,45]. Forster et al. found a strong correlation between a clinic's LTF rate and its rate of missing data for patient characteristics [1]. Ideally, a formal tracking system that "follows" patients when they receive care at other institutions would be an optimal way to track silent transfers; however, this is still in development in most countries [3,4,7,10,48]. Given these findings that most LTF patients are actually alive, our method of imputing LTF status and missing covariates at the same time is a cost-effective way to estimate true mortality and to study risk factors for HIV.
Each of the described methods in this article has different assumptions for LTF, as well as limitations and strengths (Table 4). For complete case analysis, the loss of statistical power by automatically excluding observations that have missing information is a concern for many researchers [15,29]. This automatic exclusion leaves room for bias depending on the types and patterns of missingness [28,29]. Many HIV studies have found that the underlying assumption of Kaplan Meier survival analysis, that LTF is unrelated to mortality, is incorrect, and thus that survival estimates and associations with death are biased [17,21,25,49]. Clinicians report that patients who were LTF in the early 2000s were more often later found to be dead, whereas LTF participants in more contemporary cohorts are more likely to be alive [13,22,25,50,51].
With regard to IPW from tracing data, this methodology has several limitations. IPW from tracing techniques assume that the traced participants are a representative sample of all LTF. With this assumption in mind, a random sample of LTF participants is selected for tracing [13,20,21,52-55]. In this cohort, tracing was attempted on all participants who were LTF and was performed with telephone and in-person follow-up. Additionally, in this cohort, tracing was done at the end of the 10-year follow-up period, and those who were more recently lost were more likely to be found compared to those lost at the beginning of the follow-up period. Another limitation, inherent in most IPW analyses, is the non-inclusion of several observations because of automatic case-wise deletion by the analysis software due to missing data. With this in mind, estimates might be biased and a loss of statistical power might occur when utilizing this method [22,56]. Unlike IPW, MICE is able to use all the observations in a dataset by imputing the missing values, resulting in robust results. However, it too has assumptions and is prone to limitations. One major assumption is that the risk of death among patients who are LTF is constant over time. This may not be the case as mortality is known to be highest in early periods after ART initiation and decreases over time [33,34,45,57]. Additionally, MICE relies on a good prediction model and requires data to be missing at random (MAR) [29,31]. Although MAR is difficult to ascertain, recent publications have explored the application of MICE in non-MAR situations and found that a small amount of bias might be present in the results. However, compared to the other methods, the small amount of bias that might be present is offset by the gains of using all observations present in the dataset and the robust standard errors calculated by the procedure [29,30,58].
Several studies have incorporated MICE as a method to estimate associations in the presence of attrition or loss to follow-up in longitudinal studies [59,60]. Regardless of the method used, one must diligently explore patterns of missingness before performing any analyses [10,25,28-31]. We believe that, despite some limitations with MICE, the benefits of using all available data and the subsequent calculation of robust standard errors outweigh the limitations. Therefore, the approach of imputing both the outcome and covariates seems better than more traditional methods.
Although we describe a statistical approach to approximating survival rates, implementation research is needed to determine the effectiveness and scalability of interventions to keep patients engaged in care and to return them to care [3,44,45,48]. HIV programs should consider including sensitivity analyses or other methods for estimating the vital status among those categorized as lost, as traditional methods, such as CC, IPW, Kaplan Meier and Cox proportional hazards models, do not consider that patients who are lost re-engage in care. The multiple imputation method that we describe in this paper provides an estimate that is closer to the actual outcome rates. Further research is needed to test this method in other countries and HIV programs to see if it provides outcome estimates close to actual rates.

Conclusions
In the last ten years, there has been an increase in the number of journal articles citing multiple imputation as a method used for filling in missing values or as a secondary analysis [53,61,62]. MICE might be a cost-efficient mathematical alternative that can be employed in resource-limited settings such as Haiti to impute outcome status estimates for program evaluation and to estimate survival. However, data should be evaluated for patterns of missingness.
Currently, MICE is underutilized in public health research, especially of HIV-infected cohorts. Because the benefits of MICE outweigh the potential for erroneous use, we encourage the use of MICE among our HIV research colleagues.

Table 4 Assumptions, limitations, strengths and potential bias of each method

Complete Case Analysis
Assumptions: Participants with missing data are a random sample of those intended to be observed [15,29]
Limitations: Loss of statistical power [56]; Prone to bias [29]
Strengths: Automatically implemented by software; Common method
Bias: Might be biased if participants with missing data are different to those with complete data [15]

Survival Analysis
Assumptions: LTF is unrelated to mortality
Limitations: Most studies found assumption to be incorrect; Survival is usually overestimated
Strengths: Most common method; Easy to perform

Inverse Probability Weights from Tracing
Assumptions: Those unsuccessfully traced have the same mortality as those successfully traced; "outcomes are missing at random after accounting for available covariates" [22]
Limitations: Tracing was done at the end of the 10 year follow up period on everyone; Case-wise deletion if covariates are missing; Tracing can be difficult and expensive; Only as successful as your tracing success; Loss of statistical power [56]
Strengths: Common method in HIV studies; Conceptually easy to understand; Best employed for monotone missing data [29]
Bias: Biased estimate of effect size [56]; Residual selection bias [22]

Multiple Imputation with Chained Equations
Assumptions: Missing are only randomly different from patients with same set of covariates
Limitations: Relies on a good prediction model; Susceptible to human error [29]
Strengths: Use all observations; Robust standard error; Least biased estimates of effect size [56]; Gains in precision of estimation of effects [15]
Bias: If data are not MCAR results might be biased away from the null [29]

Additional file
Additional file 1: Data analysis using R. A supplementary file that describes how to download the free statistical software package R and RStudio. It also includes the names of the R packages used for this analysis and various websites that one could consult for help using R. (DOCX 12 kb)