Public health interventions affect large population groups and can generate significant health benefits at individual and population levels. Even though many public health approaches are preventive in nature, intervening in people’s lives may nevertheless do harm as well as good. In addition, they consume both financial and human resources, and may compromise individual freedom of choice. Public health interventions range from programmatic activities that initiate direct, proximal changes in a specific technology or behaviour to those that bring about more distal changes in multi-sectoral policies with indirect impacts on health. These interventions often combine several approaches that are designed by and delivered through the health sector and/or other sectors.
Evaluating public health interventions is far from straightforward, and there is much discussion as to how evidence should be gathered, synthesised and used in decision making [3–9]. Developing recommendations or policies in public health relies on complex judgements about a range of factors, including the magnitude of the health problem, the benefits and harms of a given intervention, the use of personnel and financial resources, transferability, and intervention acceptability and feasibility. Making the decision-making process explicit and transparent is critical, as is a careful examination of the types of evidence underlying specific judgements and, in particular, the quality of evidence in support of likely benefits and harms.
Public health organisations in different countries have developed distinct approaches to convey the quality of the evidence. These include the Guide to Community Preventive Services (Community Guide) issued by the United States Community Preventive Services Task Force [11, 12], Public Health Guidance developed by the National Institute for Health and Clinical Excellence (NICE) and the Netherlands Organisation for Public Health’s recognition system for health promotion interventions [14, 15]. While, to our knowledge, these have not been formally compared, the use of many different schemes in parallel may lead to divergent ratings of the quality of evidence and conflicting recommendations. This may hinder the goal of helping guideline developers and policy-makers make well-informed decisions in a transparent way, both nationally and internationally.
The Grading of Recommendations Assessment, Development and Evaluation (GRADE) Working Group, a network of primarily clinical guideline developers and systematic reviewers, has attempted to meet this challenge by developing and testing a rigorous, systematic and transparent framework for evidence-based guideline development [17–19]. In a first step, the quality of the evidence (defined as the extent of our confidence that the estimate of effect is correct and/or that this estimate is adequate to support a particular recommendation) is classified in one of four categories: high, moderate, low or very low quality [20, 21]. Randomised controlled trials begin as high quality, while observational study designs (including non-randomised or quasi-randomised intervention studies as well as cohort studies, case-control studies and other correlational study designs) begin as low quality. The quality of the evidence can subsequently be rated down based on five criteria (i.e. risk of bias, inconsistency, indirectness, imprecision, publication bias) or rated up based on three criteria (i.e. strong association, dose-response gradient, plausible confounding). In a second step, the strength of a recommendation (defined as the extent to which we can be confident that the desirable effects of an intervention outweigh the undesirable effects) is graded as either strong or weak (conditional or discretionary). This judgement of the strength of a recommendation is based on the magnitude of desirable and undesirable consequences, the quality of evidence, values and preferences, and resource use.
More than 65 national and international organisations have adopted the GRADE approach (see http://www.gradeworkinggroup.org/society/). While the GRADE Working Group promotes the use of the framework across clinical and non-clinical health evidence, there has been much debate in the literature [22–28] and within public health organisations (e.g. European Centre for Disease Prevention and Control, Swedish National Institute of Public Health, Canadian Public Health Agency, World Health Organization) as to whether this scheme is well-suited to public health interventions.
As a contribution to this debate, the objectives of this study were to review the current use of and experience with the GRADE approach in rating the quality of evidence in the field of public health, and to identify the challenges encountered.