Challenges to evaluating complex interventions: a content analysis of published papers

Background: There is continuing interest among practitioners, policymakers and researchers in the evaluation of complex interventions stemming from the need to further develop the evidence base on the effectiveness of healthcare and public health interventions, and an awareness that evaluation becomes more challenging if interventions are complex. We undertook an analysis of published journal articles in order to identify aspects of complexity described by writers, the fields in which complex interventions are being evaluated and the challenges experienced in design, implementation and evaluation. This paper outlines the findings of this documentary analysis.
Methods: The PubMed electronic database was searched for the ten-year period, January 2002 to December 2011, using the term “complex intervention*” in the title and/or abstract of a paper. We extracted text from papers to a table and carried out a thematic analysis to identify authors’ descriptions of challenges faced in developing, implementing and evaluating complex interventions.
Results: The search resulted in a sample of 221 papers of which full text of 216 was obtained and 207 were included in the analysis. The 207 papers broadly cover clinical, public health and methodological topics. Challenges described included the content and standardisation of interventions, the impact of the people involved (staff and patients), the organisational context of implementation, the development of outcome measures, and evaluation.
Conclusions: Our analysis of these papers suggests that more detailed reporting of information on outcomes, context and intervention is required for complex interventions. Future revisions to reporting guidelines for both primary and secondary research may need to take aspects of complexity into account to enhance their value to both researchers and users of research.



Background
There is continuing interest among practitioners, policymakers and researchers in the evaluation of complex interventions. This interest stems from the need to further develop the evidence base on the effectiveness of healthcare and public health interventions, and an awareness that evaluation becomes more challenging as interventions move along the spectrum from 'simple' towards more complex interventions [1]. This focus on complexity is also driven by ongoing debate about the most appropriate methods for evaluating health systems, and the recognition that it is important to know not just whether health system interventions 'work', but also when, why, how and in what circumstances such interventions work well [2,3].
A further stimulus has been the Medical Research Council's (MRC) 'A framework for development and evaluation of RCTs for complex interventions to improve health', originally published in 2000 [4] and revised and extended in 2008 [5]. This guidance was published in response to the difficulties faced by those attempting to develop complex interventions and evaluate their impact. It describes complex interventions as being 'built up from a number of components, which may act both independently and inter-dependently' [4]. These components include behaviours, behaviour parameters and methods of organising those behaviours, and they may have an effect at individual patient level, organisational or service level or population level (or all of these in some cases). The MRC's 2008 guidance also emphasises the numbers of components and their interactions, behaviours, organisational levels and outcomes, and goes further than the 2000 framework in outlining the variability of desired outcomes and the degree to which flexibility or tailoring of the intervention is permitted. Both documents highlight the importance of establishing both whether an intervention is effective and how it works.
The term 'complex intervention' is now used extensively in the academic health literature to describe both health service and public health interventions. Complex interventions have been the topic of numerous conferences and meetings, the focus of funding calls, and will be the subject of a new chapter in the Cochrane Handbook for Systematic Reviews of Interventions [6]. The common usage of the term indicates increasing recognition of complexity and its implications for the development and evaluation of interventions. It may also be the case that, as the term has achieved wider application, it has come to be used strategically by researchers to add authority and currency to funding proposals and academic articles. However, it is not always clear that 'complexity' is being used to refer to the same things, nor what measures researchers are taking to evaluate it. It has been suggested, for example, that what is described as 'complexity' is actually just 'complicatedness', a very different concept [7].
We undertook an analysis of published journal articles in the field of health in which complexity was an important element. Our aim was to identify the aspects of complexity described by writers; the fields in which complex interventions are being evaluated; and to describe challenges experienced due to the complexity of interventions and how authors dealt with these. This paper outlines the findings of this documentary analysis focusing, in particular, on the challenges of designing, implementing and evaluating complex interventions described by authors.

Search strategy
The PubMed electronic database was searched for journal articles published in the ten-year period, January 2002 to December 2011. The start date was chosen to allow enough time for papers referring to the MRC guidance (2000) to have been published. The search looked for the term "complex intervention*" in the title and/or abstract of a paper and excluded papers not written in English. Research reports, trial protocols, systematic reviews, meta-analyses, discussion pieces, published oral presentations and letters were included. We then undertook a content analysis of the papers to identify authors' descriptions of challenges faced in developing, implementing and evaluating complex interventions. The search was undertaken systematically as described above, but we did not conduct a critical analysis of each paper because our principal aim was to provide a snapshot of current practice rather than a comprehensive review.
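As an illustration only, the search criteria described above could be written as a single PubMed query string. The exact query used in the study is not reported, so the field tags chosen here ([tiab] for title/abstract, [dp] for publication date, [la] for language) and the helper name build_pubmed_query are assumptions, not a record of what was run.

```python
from datetime import date


def build_pubmed_query(term: str, start: date, end: date, language: str = "English") -> str:
    """Compose a PubMed-style query limited to title/abstract, a date range and a language.

    Hypothetical helper: the field tags are an assumed rendering of the
    criteria in the text, not the authors' reported search string.
    """
    # Publication-date range in the YYYY/MM/DD:YYYY/MM/DD[dp] form.
    date_range = f"{start:%Y/%m/%d}:{end:%Y/%m/%d}[dp]"
    # Quoted phrase with a trailing wildcard, restricted to title/abstract.
    return f'"{term}"[tiab] AND {date_range} AND {language}[la]'


query = build_pubmed_query("complex intervention*", date(2002, 1, 1), date(2011, 12, 31))
print(query)
```

Keeping the term, dates and language as parameters makes it straightforward to rerun the same search for a later period.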

Analysis
Having read the papers, we extracted text from each paper to a table. Columns included title, author, study topic (e.g. clinical, public health, etc.), definitions of complex interventions used by authors, problems identified by authors (using the search terms 'challenge', 'barrier', 'difficult' and 'limit' to identify the difficulties described), and cited literature on complex interventions.
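The extraction step can be sketched in code as follows. This is a minimal sketch under stated assumptions: the extraction in the study was carried out by reading the papers, not by software; the class name ExtractionRow, the function flag_problem_sentences and the simple sentence splitting are hypothetical. Only the column names and the four search terms come from the text. The sample sentences are taken from passages quoted or stated elsewhere in this paper.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Search terms listed in the text, used here as case-insensitive stems.
CHALLENGE_STEMS = ("challenge", "barrier", "difficult", "limit")


@dataclass
class ExtractionRow:
    """One row of the extraction table (hypothetical structure)."""
    title: str
    author: str
    study_topic: str
    definitions: List[str] = field(default_factory=list)
    problems: List[str] = field(default_factory=list)
    cited_literature: List[str] = field(default_factory=list)


def flag_problem_sentences(text: str) -> List[str]:
    """Return the sentences that contain any of the challenge stems."""
    # Naive split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile("|".join(CHALLENGE_STEMS), re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


sample = (
    "Such contextual disadvantages remain a therapeutic challenge. "
    "In palliative care, for example, patient experience is the primary outcome."
)
row = ExtractionRow(title="(title)", author="(author)", study_topic="clinical")
row.problems = flag_problem_sentences(sample)
print(row.problems)
```

Because the stems are matched as substrings, 'limit' also picks up 'limited' and 'limitation'; a flagged sentence would still need to be read in context before being coded to a theme.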
The process of analysing the papers' content and identifying challenges described by authors produced a number of themes which we used to structure the results section. These are: intervention design (descriptions of challenges derived from the nature or content of the intervention); intervention implementation (challenges in implementing complex interventions); contextual characteristics (aspects of context that may influence implementation or evaluation of complex interventions); outcomes (reflecting the difficulties posed by the outcomes of complex interventions) and evaluation (describing challenges to evaluation). We are not suggesting that these themes are mutually exclusive. The design and content of an intervention, for example, are influenced by the context in which it will be implemented and the methodology used to evaluate it. Quotes from papers were selected to illustrate issues raised by authors.

Results
The search resulted in a sample of 221 papers of which full text of 216 (98%) was obtained and 207 were included in the analysis. Nine papers were excluded because their subject matter was not relevant for our purposes. A small number of the papers included were published online in 2011 but in print in 2012.
The 207 papers broadly cover clinical (45%), methodological (27%), health promotion (23%) and public health (3%) topics with a small number of 'others' (1%). All those included in the analysis are listed in Table 1. Some papers focus on particular health conditions, such as cancer, diabetes, HIV and mental illness; some on health and social interventions, including palliative care services, complementary therapies and decision aids; and others on methodological and theoretical issues such as causality, the use of normalisation process theory, and approaches to health promotion.
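The counts and percentages reported above can be checked with a few lines of arithmetic. The sketch assumes that the nine excluded papers were among the 216 for which full text was obtained, which is how the text reads; the variable names are illustrative.

```python
# Figures as reported in the Results section.
identified = 221   # papers returned by the search
full_text = 216    # papers for which full text was obtained
excluded = 9       # papers excluded as not relevant (assumed to come from the 216)

included = full_text - excluded
full_text_pct = round(100 * full_text / identified)

# Reported topic shares (percent of the included papers).
topic_pct = {
    "clinical": 45,
    "methodological": 27,
    "health promotion": 23,
    "public health": 3,
    "other": 1,
}
total_pct = sum(topic_pct.values())

print(included)       # 207
print(full_text_pct)  # 98
print(total_pct)      # 99
```

The topic shares sum to 99 rather than 100, which is consistent with each share having been rounded to a whole percentage.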

Use of MRC guidance
As noted above, MRC guidance on the development and evaluation of complex interventions [4,5] has been available since 2000. Without making assumptions about    . 43% (n = 90) of papers did not cite the guidance documents or any of the above key references.

Intervention design
Here we outline the challenges described by writers in deciding upon and standardising the content of interventions which may include a number of components.

The value of a theoretical understanding
The MRC guidance advises that intervention design should be based on a theoretical understanding of how an intervention causes change. Some papers focused on the development of an explanatory framework or rationale to inform intervention design and evaluation. These included, for example, one aimed at identifying and differentiating the components of two approaches to acupuncture (biomedical and traditional). These authors describe using a 'realist review' approach to develop an analytical framework for their review: 'Its first step is to uncover or identify the essential or implicit theory or theories that underlie an intervention, that is, how the intervention is thought or meant to work and its expected impacts.' [12] Another research team described the process of developing an optimal complex physical therapy intervention for patients with hip osteoarthritis 'in light of current knowledge and expert opinion' given the lack of understanding about how individual components of the therapy affect the disease process [13]. In this case, the development of a theoretical framework meant collecting evidence to help understand the aetiopathogenesis and physical impairments associated with the condition.
Others explained how they used existing models and theories to inform interventions and evaluation design. Drawing on previous studies, Borglin, Gustafsson and Krona [14] describe using the Theory of Planned Behaviour to develop a series of workshops for nurses to improve pain management for cancer patients. The Normalisation Process Model [15] was used as a theoretical framework in two RCTs in maternity care and was reported to be of value in understanding organisational contexts into which new models of care are introduced [16].
' …the use of this theoretical model will deepen our understanding of which factors contribute to the legitimacy of an intervention and thus the likelihood that it will be sustainable.' [16] Even if evidence is available, it may not be possible to predict which elements of an intervention will be acceptable to health care staff and patients, have the desired effect and be sustainable. It may also be difficult to define exactly what will constitute the intervention: '…developing precise inclusion criteria for such complex interventions is more problematic, because by definition it is not clear a priori which mechanisms have to be in place in order to define an intervention as "collaborative care".' [17] Nonetheless, the authors report that developing a theoretical framework early in a study enables attention to be focused on what needs to be done to plan, implement and sustain an intervention and what is less important.

Standardisation and treatment fidelity
Implementing any intervention in a standard format across sites is not straightforward but standardising complex, multiple treatment interventions, which may have a number of interacting components, is difficult and, some researchers argue, standardising the form of an intervention rather than its function may not be appropriate [18]. Two main challenges to standardisation were identified in these papers: on the supply side, the likelihood of variation in the delivery of services (e.g. [19]), and, on the demand side, the wide range of patients' diagnoses, stages of disease, needs and preferences (e.g. [20]).
'Because of heterogeneity regarding settings, experiences, training, etc. and lack of standardisation, it is very difficult to compare different HPCTs [hospital palliative care teams]; hence the need for careful definition.' [21] 'In the example given […] of a Computer Decision Support System, is the intervention the software or the combination of the software and the staff working in the call centre?' [15] Attempting to standardise an intervention to meet the needs of researchers may lead to perverse outcomes: 'The advantage of standardisation [in acupuncture interventions] must be offset against the disadvantage that such treatments, when obviously inadequate or inappropriate, cannot be modified, as would normally occur in routine clinical practice.' [22] A degree of flexibility in the design and implementation of interventions was advocated by a number of writers with the aim of ensuring that interventions could be adapted to both local circumstances and to patients' needs.
'…it is important to retain some flexibility, allowing adaptation of the intervention to the local context and ensuring the intervention can be tailored for individual OHC [oral health care] needs.' [23] As well as disparity in delivery, differences in the frequency of interventions and lack of a precise definition of the start of treatment were described (e.g. [24]). The MRC guidance [5] asserts that 'any variation in the intervention needs recording, whether or not it is intended, so that fidelity can be assessed in relation to the degree of standardisation required by the study protocol'. Replicability would be compromised by undocumented variation.
In order to record how implementation is carried out on the ground, the authors of one paper (on the topic of secondary prevention of heart disease in general practice) suggest using a range of treatment fidelity procedures to monitor the intervention and to capture the processes involved. These procedures enhance validity and reliability with the aim of 'reducing errors in the interpretation of study outcomes and attributing outcomes directly to the effect of the intervention' [25]. Examples described include standardised training sessions, project manager observation, quality assurance visits to practices during intervention implementation, use of a structured recall system, research nurse observation of general practitioners and practice nurses during intervention consultations and use of practice and patient care plans to document the process of intervention delivery.

Intervention implementation
To implement an intervention one must think at an early stage about who will be responsible for what and in what setting [5]. In the case of complex interventions, there may be a number of individuals, institutions or agencies involved across several sites. After an intervention has been trialled or evaluated (and depending on the outcome), consideration should be given to its sustainability and the ease with which it can be integrated into usual service. In this section, we consider the challenges, ranging from the philosophical to the practical, identified by writers in implementing interventions.
'Even when the concept of RRS [rapid response systems] is believed to be advantageous, the actual implementation entails overcoming a myriad of barriers: political, financial, educational, cultural, logistic, anthropological, and emotional.' [26] Structural and logistical obstacles may have an impact on effective implementation in the 'real world' where it is not always possible to control activities and outcomes.
'Campus Watch has undergone many changes, both structural and functional, since it was introduced in 2007; its evolution has not been guided by an overarching design and modifications have occurred for reasons that have not always been well documented.' [27]

Staffing issues
Those at the front line of 'delivering' an intervention may face time and resource difficulties or lack of buy-in with the aims of the intervention while there may be political and/or financial considerations further up the organisational hierarchy. The replication, regulation and sustainability of new practices in diverse teams across a number of sites can make heavy demands on staff who may experience competing priorities if they are also involved in data collection for evaluation purposes:
'There was no systematic exploration of midwives' views of working in the models post RCT, or of the views of other stakeholders such as non-team midwives, managers and obstetric staff during or after completion of the team RCT, nor during the subsequent iterations of the team model. Therefore it is not possible to draw conclusions about why the original evaluated model was not sustained.' [16]
'In …the area where breastfeeding rates did not improve, health professional support for the project was weaker and relationships between midwives and health visitors were problematic.' [28]
'It is clear that teachers found it difficult to deliver the programme for a variety of logistic reasons (low morale, lack of support and competing priorities at school) and contextual reasons (difficulty teaching about sensitive issues, switching from their traditional teacher role, and lack of trust between pupils and teachers).' [29]
Implementing an intervention uniformly may create difficulties for clinicians whose first aim is to provide the most effective care to patients. The papers present examples of treatment that deviated from the protocol because of decisions made by staff:
'At least two control patients are known to have received more intensive physical therapy, i.e. muscle-strength training, than they would have otherwise. We believe that once the surgeons sensed that patients receiving intensive physical therapy were responding well, the surgeons were likely to have encouraged their patients to get more physical therapy, thus further diluting the impact of the intervention.' [30]

Patient issues
A number of issues relating to patients were raised by authors. These included patients' preferences and patient/staff interaction, and recruitment and retention to trials. Studies about the treatment of chronic illness, for example, emphasised the role of patients (and carers) in active management of health conditions [31,32]. Less positively, one paper reported that for a number of reasons 'despite initial willingness, after a few weeks some patients [suffering from psychosis] no longer wanted to receive therapy' [33]. A review on the topic of patients with medically unexplained symptoms reported that patients distrusted doctors regarding emotional aspects of their problems while doctors were concerned about encouraging patient dependence [34].
Those conducting trials reported that recruitment and retention of participants may be negatively affected if the intervention targets patients who are severely ill or who are hard to reach. Examples reported included patients with advanced dementia and their carers [35], those receiving palliative care [21] and young drug users [36]. In the first example, unbiased comparisons could not be made between intervention and control groups because of sample attrition [35]. In their consideration of the strengths and weaknesses of a before-after study design, Simon and Higginson [21] offer suggestions for strategies (including inclusion of a control group in research design, time series approaches, and more robust outcome measures) to control and limit secular trends, bias and confounders. Garfein and colleagues [36] describe one method used to retain participants: 'Given the anticipated difficulty in retaining young IDUs [intravenous drug users] for a longitudinal study, follow-up window periods were designed such that the need for high retention was balanced with the need for uniform intervals between the intervention and follow-up assessments.' In evaluating an intervention aimed at high-utilising patients with medically unexplained symptoms, Lyles and colleagues [37] describe how they achieved their impressive retention rate of 98%: 'Remunerating participants in recognition of their time commitment helped to maintain interest. However, consistent, clear communication from project staff and persistence in contacting participants were also important factors in enrolling and retaining subjects. We maintained a communication link with participants at intervals throughout the project.'

Contextual characteristics
Complex interventions, by their nature, are more likely than simpler ones to depend for their success on the context in which they are implemented [38]. Authors described the impact of structural, capacity, professional and political factors on their introduction. The most commonly cited contextual barrier to implementation was the organisational context. As one author put it: 'The findings concur with previous studies, which suggest that organisational environment and culture, and client factors may influence occupational therapy practice.' [39].
Organisational context encompassed a wide range of elements from the parochial to the regional or national level and included organisational cultures, such as hierarchies and professional boundaries, staffing arrangements, social, geographical and environmental barriers, and the impact of other simultaneous organisational changes. The organisational context could either help or hinder the implementation of an intervention, or do both at the same time.
'More attention should be given to the systems into which policies and complex interventions intervene. Particularly how the negative consequences of the environment, resource shortages, organisational change, competing demands and leadership affect an organisation's ability to effectively deliver an intervention.' [40]
'The difficulties of delivering complex interventions in inner city areas are well known to clinicians, and might be attributed variously to low levels of social support, high levels of deprivation, and relative residential instability. Such contextual disadvantages remain a therapeutic challenge.' [33]
'Although the changing of long-term entrenched practices of physicians and other professionals is known to be a difficult task, problem solving in expanding cycles was able to affect such a change and produce an effective cervical cancer screening programme with no increase in financial resources.' [41]
Another example of an organisational barrier to implementation was lack of support for what were seen as demanding projects. GP practice staff, for example, were thought to have few incentives for engaging in thinking through and developing complex new service arrangements: 'Furthermore the external environment was not a sufficiently supportive context for the scope of the proposed shared care developments: it was seen as "a big project".' [42]

Outcomes
Having established what outcome(s) an intervention is aiming to achieve, researchers face challenges in designing tools to effectively measure outcomes, understanding 'the length and complexity of the causal chains linking intervention with outcome' [5], explaining discrepancies between expected and observed outcomes, and capturing the long term characteristics of outcomes after a trial or study is concluded.

Multidimensional outcome measures
Outcomes are likely to be plural and multi-dimensional, spanning 'the spectrum from mortality, morbidity, disability, to satisfaction and cost' [43] as 'restricting the success indicator to one single health or behavioural outcome leads to many unsolved questions about the success factors for, and barriers to, the effectiveness of the intervention' [44]. Clinical pathways are aspects of complex interventions that may demand outcomes be measured across many domains including clinical, service, team, process and cost [38]. As well as breadth, outcome measures must take time into account and may be designed for the short, medium or long term or all three.
'Given this degree of complexity identifying a single primary outcome measure to capture the impact of an OHC [oral health care] intervention is problematic. We would anticipate that a multifaceted OHC intervention would impact upon a range of components including for example dental referrals, staff knowledge and patients' oral health.' [23]
'It is therefore critical that the impact of new models of care are rigorously evaluated, considering outcomes for women and infants as well as outcomes for midwives and other maternity care providers.' [16]

Assessing outcomes
Apart from difficulties in deciding upon measurable outcomes imposed by the complexity of interventions, writers noted that there is now an expectation that the bio-psycho-social aspects of interventions be measured as well as the clinical ones [45]. In palliative care, for example, patient experience is the primary outcome [46]. In general, it is argued that patient-centred outcomes, such as quality of life, as well as the views and experiences of staff should be taken into account. Some authors suggested that methods of measuring outcomes did not always capture the positive impact of an intervention and, in some cases, described their use of qualitative data to measure patient experience (e.g. [47]).
'The lack of an objective outcome was in contrast to subjective feedback from the study participants who felt that the intervention had produced a change in practice.' [48] 'Reliance on empirical and societal defined outcomes often hides success in terms of participant defined outcomes.' [19] Establishing 'hard' outcome measures was seen to be difficult in particular fields where the success of an intervention does not necessarily equate with patient improvement or survival.
'The holistic approach of palliative care and its services causes some problems in defining clear outcomes and finding valid measurements.' [21]
'There is a lack of an accepted primary outcome regarding the use of decision aids. Possible categories to classify measures of effectiveness are knowledge, decision process (e.g., satisfaction and participation preference), decision outcomes (e.g., has a treatment decision been made, adherence), health status, and economic measures.' [49]
Some writers admitted that it was not possible to attribute the 'active ingredient' [4] of a complex intervention to a particular component of its design:
'If this complex intervention does reduce mortality the relative contributions of education, PEWS [paediatric early warning system] and MET [medical emergency team] to clinical effectiveness is unknown.' [50]
'In many cases, the effectiveness of training is more difficult to measure because a wide range of variables unrelated to the training intervention can mediate both the training process and the outcome. These variables need to be considered if it is to be established whether an outcome is due to the training intervention or other unrelated factors. For instance, variables related to the individual have been shown to mediate impact on outcomes like stress and burnout levels, and staff satisfaction.' [51]

Evaluation
The process of evaluating health service interventions occurs before, during and after implementation. In this section, we highlight some important issues raised by authors but do not systematically describe the many research designs which are the subject of the papers themselves.

Formative and process evaluation
The 2008 MRC guidance suggests that 'A mixture of qualitative and quantitative methods is likely to be needed, for example to understand barriers to participation and to estimate response rates' [5] to assess the feasibility of an intervention. As noted above, qualitative data are increasingly recognised as 'an essential component of health services research' [52], providing insights into the acceptability of interventions and their social consequences which cannot be measured by quantitative approaches. Formative evaluation, conducted to aid intervention design, can offer insights into the views and priorities of both patients and practitioners.
'The key to the successful development of the complex intervention was the use of qualitative research that ensured that the intervention was based on data from interactions in ongoing trial recruitment appointments. Exploratory qualitative research of recruitment appointments in the Protect feasibility study showed that improvements to the presentation of study information increased rates of randomization from 30% to over 65%.' [53] Process evaluation is particularly important in multisite trials, 'where the "same" intervention may be implemented and received in different ways' [11].
'Neither quantitative nor qualitative approaches alone would provide an adequate insight into the implementation of the intervention across all three levels of care, from the perspective of all involved and capture the information needed in relation to both effectiveness and feasibility issues.' [23]

Limitations
The number and range of papers discussed here are not comprehensive given the search terms used and the database searched, and selection bias is therefore possible. However, we feel that the number included is large enough for our purposes. We conducted a content analysis rather than a systematic review, which supported our aim of identifying aspects of complexity in health interventions, the fields in which they are implemented and the challenges experienced by researchers.

Summary of results
The literature on complex interventions is thick with descriptions of complex, challenging interventions, but thin on practical advice on how these should be dealt with. In the papers we surveyed, authors pointed to the practical value of theory in determining which features of an intervention and its context are likely to be important in influencing outcomes and determining sustainability. They caution against attempting to define and standardise the intervention too narrowly, following Hawe and colleagues' lead (standardising on 'function' rather than 'form') [10]. This also means having procedures in place to document what is actually done under the heading of 'the intervention'.
The interaction between intervention and context is frequently emphasised, and one aspect of context which is highlighted in several papers is the people involved, including staff and patients themselves. The MRC guidance notes that complexity may derive from interaction between patient or recipient and provider. The implication for implementation and evaluation is that (in the case of healthcare interventions) barriers at both levels should be considered and mitigated and, in the case of evaluation, relevant data collected. These barriers could also be built into the initial logic model driving the evaluation [54]. The papers also point to the wide range of contexts which have been considered as relevant, including professional boundaries and hierarchies, which do not often feature in descriptions of context, but are clearly relevant in some of these examples. In one study the specific recommendation is made that attention should be given to the systems into which complex interventions are placed [40]. In practical terms this may mean describing those systems in detail and at different levels and theorising on how they may affect the effectiveness of implementation.
Several studies point to a multiplicity of health and non-health outcomes as a source of complexity. In many of the papers which raise this as an issue, there is an implicit need for outcome measures, or a range of outcome approaches, capable of capturing outcomes across different dimensions and time scales. This may imply a move away from a focus on primary outcomes and a small number of secondary outcomes towards a much more multi-criteria form of assessment which acknowledges the multiple objectives of many complex interventions.

Implications
The above comments may have implications for reporting of studies of complex interventions. The quotes suggest that more detailed reporting of information on outcomes, context and intervention is required for complex interventions. However, reporting guidelines for quantitative studies may require further adaptation to enable adequate explanation of complex interventions, and the contexts within which they were implemented. Defining and describing context, for example, may prove particularly challenging and, given the inherent flexibility in complex interventions themselves, even defining the intervention may be difficult. Future revisions to reporting guidelines for both primary and secondary research may need to take aspects of complexity into account to enhance their value to both researchers and users of research.