Clinical practice guidelines (CPG) serve to guide clinical practice and inform quality improvement programs by generating clinical standards and performance measures. Clinicians will only use guideline recommendations if they perceive them to be evidence-based, unambiguous, and feasible within routine care.1 Despite the proliferation of CPGs, direct evidence of impact on quality of care or patient outcomes is limited.2 Several explanations exist,3 including perceptions of bias in the development of guideline recommendations.4 Recently published claims5 and counter-claims6 of bias in Australian guidelines7 and related position statements8 regarding the management of acute coronary syndromes highlight problematic issues in CPG development. These include conflicts of interest (COI) of guideline panellists, validity and strength of recommendations, and involvement of external stakeholders and end users. We offer strategies for dealing with these issues in a transparent and explicit manner.
The quality of CPG bears little relation to the level of seniority or expertise of guideline authors.9 Guideline panellists often harbour COI that may not be fully evident, even to the panellists themselves, but which can potentially bias their recommendations.10 These conflicts include not only financial ties with industry but also practice reimbursement incentives, professional affiliations and practice specialisation, intellectual attachment to their own studies, ideas and innovations, and desire for academic recognition and career advancement.11 The most entrenched conflict can be a disinclination to challenge or reverse strongly held beliefs. Using research evidence to make recommendations requires subjective interpretations, which will be influenced by the value structure of panel members.12 Vulnerability to preconceptions is greatest for recommendations based on low-quality evidence — an increasingly frequent occurrence in contemporary CPG13 — although recommendations based on high-quality evidence are far from invulnerable.
Most current guidelines remain susceptible to COI, which can impinge on all stages of the CPG development process (Box 1). Many are published without peer review or, if contained in journal supplements, escape the standard of peer review applied to articles published in the parent journal.14 Moreover, many guidelines (79% in a recent survey of Australian guidelines15) fail to mention possible competing interests of guideline panellists. Even if COI are disclosed, guideline users may not adjust their perceptions of recommendations in response to such disclosures.16
Strategies for dealing with COI are outlined in Box 1,17-19 with key strategies being:
Nominated panellists must disclose all industry-related professional activities, including research grants and speaker support, and, for the duration of guideline development, divest themselves of direct financial interests (stock ownership, board positions, consultancy agreements) in commercial companies with an interest in any guideline recommendation.
Panellists are required to identify all sections of the draft guidelines for which they have COI. These conflicts are recorded in a COI grid maintained by the guideline chairperson.
Methodologists free of financial or intellectual conflicts of interest share responsibility with content experts for collecting and interpreting evidence.
Explicit processes must be used to assess evidence quality and link this directly with strength of recommendations.
Only conflict-free panellists (both methodological and content experts) are involved in determining the direction (for or against a specific clinical action) and strength of recommendations.
Lack of consensus around evidence quality or recommendations is resolved by explicit democratic processes (such as Delphi rounds and nominal group techniques) involving conflict-free panellists who have thoroughly reviewed the related evidence.
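The COI grid described in the strategies above can be pictured as a simple lookup of panellists against guideline sections. The following is a minimal sketch only, assuming a hypothetical `Panellist` record and a hypothetical `eligible_voters` helper; the panellist names and section labels are illustrative and are not drawn from any actual guideline panel.

```python
from dataclasses import dataclass, field


@dataclass
class Panellist:
    # Hypothetical record: one row of the COI grid kept by the chairperson.
    name: str
    role: str  # e.g. "methodologist" or "content expert"
    conflicted_sections: set = field(default_factory=set)


def eligible_voters(panel, section):
    """Return names of panellists with no declared COI for this section."""
    return [p.name for p in panel if section not in p.conflicted_sections]


# Illustrative panel; names and sections are invented for the example.
panel = [
    Panellist("A", "content expert", {"anticoagulants"}),
    Panellist("B", "methodologist"),
    Panellist("C", "content expert", {"antiplatelets"}),
]

for section in ("anticoagulants", "antiplatelets"):
    print(section, eligible_voters(panel, section))
```

In this sketch a panellist is excluded only from sections for which a conflict has been declared, which mirrors the idea that only conflict-free panellists determine the direction and strength of each recommendation.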
Individuals should be invited to join guideline panels through an open, transparent application process centred on selection criteria that ensure an appropriate balance of content and methodological expertise. Such criteria may comprise extent of clinical experience with the topic in question, prior participation in undertaking critically appraised literature reviews, intended commitment in time and intellectual input into the guideline development process, and referee reports. For guidelines that deal with common conditions and are aimed at large, multidisciplinary audiences, panel composition should reflect the spectrum of end users and avoid being dominated by a narrow spectrum of specialists.
The impact on guideline content if such policies were implemented and enforced is yet to be empirically determined,20 but many organisations involved in guideline development have now adopted at least some of them as best practice for reducing the probability of conflicted panellists having undue influence.17-19
Clinicians lose confidence in CPG when separate guidelines on the same clinical topic from seemingly authoritative sources produce conflicting recommendations. For example, United States and European CPGs differ in their recommendations for use of anticoagulants in acute coronary syndromes.21 The ways in which guideline panellists have interpreted and weighted the evidence and used it to formulate recommendations of different strength must be clearly communicated.
While various systems exist for rating evidence according to hierarchies of study design, with randomised controlled trials (RCTs) at the top, most contain no explicit processes for assessing evidence quality or linking it with recommendations.22 The Grading of Recommendations Assessment, Development and Evaluation (GRADE) system (Box 2) attempts to meet this need23 and has advantages over other grading systems in that it:
clearly separates quality of evidence and strength of recommendations;
explicitly evaluates alternative management strategies;
provides clear-cut, detailed criteria for downgrading and upgrading the quality of evidence rating for each outcome of importance to patients, excluding surrogate outcomes;
provides a transparent process for moving from evidence to recommendations and grading recommendations as strong or weak on the basis of clearly defined, pragmatic interpretive criteria;
explicitly acknowledges patient preferences; and
details potential resource use.
Adopting the GRADE system may assist in exposing and mitigating bias arising from COI, thus augmenting COI policies. Applying the GRADE quality of evidence classification to contemporary CPG suggests that many (in one study almost 50%24) RCT-derived recommendations fail to meet a priori definitions of high-quality evidence. Several examples now exist of how the GRADE system promotes the development of CPG recommendations that are more aligned with evidence quality.25,26
More than 50 organisations worldwide have adopted the GRADE system, including the World Health Organization, the American College of Physicians, the Cochrane Collaboration, the Scottish Intercollegiate Guidelines Network (SIGN) and UpToDate (an online clinical decision support system). In Australia, the National Health and Medical Research Council (NHMRC) has recently produced a revised draft schema for more explicit, structured grading of evidence quality and strength of recommendations, which has several similarities to, and differences from, the GRADE system.27
Guidelines commonly base their recommendations on trials involving selected populations and standardised interventions. These may not be applicable to unselected populations receiving care from clinicians working under real-world constraints.3 Benefits reported in trials may not be reproducible in patient groups, such as older patients with multiple comorbidities, that are underrepresented in such studies or in clinical settings very different to those used in trials.28
Guideline panellists should assess the extent to which evidence of treatment benefit is consistent, or even exists, across different populations with different comorbidity spectra, in different settings and with different modes of treatment administration. The circumstances under which the magnitude of treatment benefit (and harm) is significantly enhanced or attenuated should be highlighted in the way recommendations are presented. Recommendations should, where appropriate, stratify populations according to disease risk and target treatments to those who will experience greatest net benefit.29
Guideline authors must avoid exercising power without responsibility. They should not oblige clinicians and health services to enact recommendations and satisfy guideline-based performance measures with little regard for the added problems and pressures these may create in terms of professional interactions, team functioning, organisational predispositions, resource availability and medicolegal considerations.3,30
In developing guidelines, transparent processes are needed that deal with potential COI, rate the quality of evidence and strength of recommendations, and address real-world needs of guideline users. The strategies outlined here, if adopted by guideline panels, may limit protracted interpretive debates and correct deficiencies that inhibit a wider use of CPG. While they potentially impose more effort, cost and delay in developing guidelines, we believe these imposts are outweighed by the minimisation of recommendations that are biased, poorly substantiated or insensitive to patient and clinician needs and which, if followed, may have far-reaching deleterious effects on clinical practice.
Box 1. Steps in developing clinical practice guidelines (CPGs), potential conflicts of interest (COI) and potential solutions17-19
Box 2. The Grading of Recommendations Assessment, Development and Evaluation (GRADE) system23
GRADE proposes that the quality of evidence associated with each outcome of importance to patients be evaluated separately. The GRADE system classifies quality of evidence into four levels: high, moderate, low or very low. Evidence from randomised controlled trials (RCTs) begins as high quality, but may be rated down if trials demonstrate one of five categories of limitations. Observational studies begin as low-quality evidence, but may be rated up if associated with one of three categories of special strengths.
Reasons for rating down quality of evidence
Risk of bias: Quality will be lower if most of the evidence from available RCTs is compromised by limitations such as: lack of allocation concealment; lack of blinding (particularly if outcome assessment is highly susceptible to bias); large losses to follow-up; failure to analyse patients in the groups to which they were randomised; premature termination for benefit; or failure to report outcomes (often those for which no effect was observed).
Inconsistent results: Widely differing estimates of treatment effect across studies suggest true differences in underlying treatment effect, and if investigators fail to identify a plausible explanation, quality of evidence decreases. Variability may arise from differences in populations, interventions, or outcomes.
Indirectness of evidence: In comparing effects of two active treatments, randomised head-to-head trials constitute high-quality evidence. Indirect comparisons of the magnitude of effects seen in separate placebo-controlled trials of each treatment constitute lower quality evidence. Another type of indirectness arises if there are important differences between the populations (eg, elderly v non-elderly), interventions (eg, low v high dose) and outcomes (patient-important v surrogate) measured in trials and those under consideration in the guideline.
Imprecision: When studies include relatively few patients and few events and thus have wide confidence intervals, quality of evidence decreases.
Publication bias: Failure to report studies that typically show no effect reduces evidence quality. Such publication bias is more likely when only a small number of trials, all funded by industry, are available.
Reasons for rating up quality of evidence
Large and consistent effect sizes: If several large and methodologically strong observational studies report a very large effect size and confounding is unlikely to explain all or most of the apparent benefit, quality of evidence can be rated up (eg, hip replacement in severe osteoarthritis or dialysis for end-stage renal failure).
Presence of a dose–response gradient: Where intensity of intervention (eg, dose, duration, or parenteral v oral method of administration) shows a correlation with effect size, the quality of evidence may increase.
Accounting for all plausible confounding: Where investigators have accounted for all plausible biases which might decrease the magnitude of an apparent effect or create a spurious effect when results show no effect, the quality of evidence increases.
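The rating rules above can be illustrated with a short sketch. It assumes, purely for illustration, that each category of limitation lowers the rating by exactly one level and each special strength raises it by one level; in practice panels exercise judgement about how far to move a rating. The function name `rate_quality` and the category labels are hypothetical.

```python
LEVELS = ["very low", "low", "moderate", "high"]

# Starting points as described in the box: RCTs high, observational low.
START = {"rct": 3, "observational": 1}

RATE_DOWN = {
    "risk of bias", "inconsistency", "indirectness",
    "imprecision", "publication bias",
}
RATE_UP = {"large effect", "dose-response gradient", "plausible confounding"}


def rate_quality(study_design, concerns=(), strengths=()):
    """Return a quality level for one outcome.

    Assumption for this sketch: one level per category, in either direction.
    """
    if study_design not in START:
        raise ValueError(f"unknown study design: {study_design!r}")
    unknown = (set(concerns) - RATE_DOWN) | (set(strengths) - RATE_UP)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")

    downs = len(set(concerns))
    # The box describes rating up only for observational studies.
    ups = len(set(strengths)) if study_design == "observational" else 0

    score = START[study_design] - downs + ups
    # Keep the result within the four defined levels.
    return LEVELS[max(0, min(score, len(LEVELS) - 1))]


print(rate_quality("rct", concerns=["imprecision"]))
print(rate_quality("observational", strengths=["large effect"]))
```

Because GRADE evaluates each patient-important outcome separately, a function of this kind would be applied once per outcome rather than once per guideline.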
Grading strength of recommendations
The GRADE system grades recommendations as “strong” or “weak” based on four determinants: quality of evidence, trade-off between desirable and undesirable consequences, variability in patient values and preferences, and resource use. When desirable effects of an intervention clearly outweigh undesirable effects, or vice versa, and estimates are based on high-quality evidence, the recommendation is strong. When trade-offs are less certain (lower quality evidence or desirable and undesirable effects closely balanced), the recommendation is weak. Also, the greater the variation in values and preferences of patients (and/or informed proxies), or the greater their uncertainty, the more likely a weak grading is warranted. Similarly, the more uncertain it is that an intervention represents a wise use of resources (eg, a marginal net benefit of a very resource-intensive intervention), the lower the likelihood of a strong grading.
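The four determinants can be sketched as a simple decision rule. This is a deliberate simplification: the text describes wide variation in patient values and uncertain resource use as making a weak grading more likely, whereas the sketch below treats each as sufficient on its own. The function name `grade_strength` and its parameters are hypothetical.

```python
def grade_strength(quality, balance_clear, values_vary_widely,
                   resource_use_uncertain):
    """Return "strong" or "weak" for one recommendation.

    Simplifying assumption: any single unfavourable determinant
    yields a weak grading.
    """
    if quality not in ("high", "moderate", "low", "very low"):
        raise ValueError(f"unknown quality level: {quality!r}")
    favourable = (
        quality == "high"
        and balance_clear
        and not values_vary_widely
        and not resource_use_uncertain
    )
    return "strong" if favourable else "weak"


# High-quality evidence with a clear balance of benefit over harm.
print(grade_strength("high", True, False, False))
# Closely balanced effects, even with high-quality evidence.
print(grade_strength("high", False, False, False))
```

The same rule applies whether the recommendation is for or against an intervention, since a clear balance may favour either direction.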
Provenance: Not commissioned; externally peer reviewed.
Abstract
A recently published critique of a set of Australian clinical practice guidelines (CPG) highlighted problematic issues in guideline development concerning conflicts of interest of guideline panellists, validity and strength of recommendations, and involvement of end users and external stakeholders.
Management of financial or intellectual conflicts of interest requires: full disclosure; limitations on industry or agency financial support during guideline development; a representative panel that includes conflict-free members; and only conflict-free panellists to be involved in drafting guideline recommendations.
Guideline panels should consider adopting the GRADE (Grading of Recommendations Assessment, Development and Evaluation) system to assist in determining the validity and strength of recommendations.
Guideline panels should seek formal feedback from external stakeholders and end users.
Enacting such policies aims to lend greater transparency and credibility to CPG, limit protracted and unhelpful interpretive debates, and promote wider use of CPG.