Journal of Clinical Endocrinology & Metabolism , doi:10.1210/jc.2007-1907
The Journal of Clinical Endocrinology & Metabolism Vol. 93, No. 3 666-673
Copyright © 2008 by The Endocrine Society


REVIEW

A Case for Clarity, Consistency, and Helpfulness: State-of-the-Art Clinical Practice Guidelines in Endocrinology Using the Grading of Recommendations, Assessment, Development, and Evaluation System

Brian A. Swiglo, M. H. Murad, Holger J. Schünemann, Regina Kunz, Robert A. Vigersky, Gordon H. Guyatt and Victor M. Montori

Knowledge and Encounter Research Unit (B.A.S., M.H.M., V.M.M.), Divisions of Endocrinology, Preventive Medicine, and Internal Medicine, Mayo Clinic College of Medicine, Rochester, Minnesota 55905; Clinical Advances Through Research And Information Translation Research Group (H.J.S., G.H.G.), Department of Clinical Epidemiology and Biostatistics, Faculty of Health Sciences, McMaster University, Hamilton, Ontario L8S4L8, Canada; Department of Epidemiology (H.J.S.), Italian National Cancer Institute "Regina Elena," 00161 Rome, Italy; Basel Institute for Clinical Epidemiology (R.K.), University Hospital Basel, CH-4031 Basel, Switzerland; and Diabetes Institute (R.A.V.), Walter Reed Health Care System, Washington, D.C. 20307

Address all correspondence and requests for reprints to: Victor M. Montori, M.D., M.Sc., Mayo Clinic, W18A, 200 First Street SW, Rochester, Minnesota 55905. E-mail: montori.victor@mayo.edu.


    Abstract
 Top
 Abstract
 Introduction
 Developing Rigorous and Helpful...
 The GRADE System
 Strength of Recommendations
 Quality of the Evidence
 Values and Preferences
 Future Directions
 Conclusions
 References
 
Context: The Endocrine Society, and a growing number of other organizations, have adopted the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) system to develop clinical practice guidelines and grade the strength of recommendations and the quality of the evidence. Despite the use of GRADE in several of The Endocrine Society’s clinical practice guidelines, endocrinologists have not had access to a context-specific discussion of this system and its merits.

Evidence Acquisition: The authors are involved in the development of the GRADE standard and its application to The Endocrine Society clinical practice guidelines. Examples were extracted from these guidelines to illustrate how this grading system enhances the quality of practice guidelines.

Evidence Synthesis: We summarized and described the components of the GRADE system, and discussed the features of GRADE that help bring clarity and consistency to guideline documents, making them more helpful to practicing clinicians and their patients with endocrine disorders.

Conclusions: GRADE describes the quality of the evidence using four levels: very low, low, moderate, and high quality. Recommendations can be either strong ("we recommend") or weak ("we suggest"), and this strength reflects the confidence that guideline panel members have that patients who receive recommended care will be better off. The separation of the quality of the evidence from the strength of the recommendation recognizes the role that values and preferences, as well as clinical and social circumstances, play in formulating practice recommendations.


    Introduction
 
Professional organizations, such as The Endocrine Society and its sister societies, have set out to develop clinical practice guidelines to provide helpful recommendations to practicing clinicians, to improve quality of care, and to enhance patient outcomes. By producing guidelines, these organizations seek to assert their academic and practice leadership in areas of primary concern. Given the policy and legal implications of guidelines, state-of-the-art guideline developers follow rigorous and transparent procedures for formulating recommendations for or against a particular diagnostic or therapeutic intervention. Guidelines are strengthened further if they involve panel members without substantial conflicts of interest (i.e. members who do not expect to benefit directly or indirectly, now or in the future, personally or financially, from making a particular recommendation) and conduct their proceedings without for-profit support. Key to their success is the expectation that clinicians will deliver better care for their patients if they follow guideline recommendations. Thus, clinicians need to find the recommendations both clear and helpful.

In this article we will discuss the processes involved in developing helpful and rigorous clinical practice guidelines in a manner congruent with the approach The Endocrine Society has adopted. We anticipate that this will assist endocrinologists and other parties who are interested in critically appraising, implementing, and enhancing The Endocrine Society’s clinical practice guidelines.


    Developing Rigorous and Helpful Clinical Practice Guidelines
 
Evidence-based medicine recognizes two principles (1). The first is that there is a hierarchy of evidence such that one is more confident about decisions based on evidence that offers greater protection against bias and random error. The second principle is that evidence alone is never sufficient to make clinical decisions. In fact, evidence-based medicine stipulates that optimal treatment decisions require integration of clinical knowledge and research evidence with patient circumstances, including their values and preferences. The rigorous application of these principles to the development of clinical practice guidelines is a relatively recent development.

Therefore, evidence-based guidelines are most helpful when they provide recommendations that are clear, based on the best available research evidence, and transparent in terms of reporting the quality of the evidence and the basis for determining the strength of the recommendations. Often this includes explicitly describing the pertinent values and preferences the guideline authors bring to bear in developing the recommendations.

For over a decade, most guideline groups have recognized that developing a summary categorization of the strength of the recommendations and the quality of the evidence supporting them, processes sometimes called grading (of the recommendation strength) and rating (of the evidence quality), helps clinicians understand a practice guideline’s summary message. The multiple systems in use produce different grading and rating categories, and rely on different letters, numbers, symbols, and terms (2). This can cause confusion where clarity is needed.

To address this concern, the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) working group, composed of expert methodologists and guideline developers from a variety of health care organizations, set out to: 1) evaluate these different systems, 2) develop one recommended grading system, and 3) disseminate this system throughout medical communities and their literature. The challenge was great because many systems were already in place, all systems have limitations, and many organizations have spent significant resources on developing their own rating systems (3). GRADE’s design criteria included simplicity and applicability to a wide variety of clinical recommendations that encompass the full spectrum of patient management decisions. The GRADE working group first published its findings in 2004 (4).

Since that time, numerous organizations have adopted GRADE as their guideline grading system. These organizations include The Endocrine Society, World Health Organization, American College of Chest Physicians, UpToDate, American College of Physicians, American Thoracic Society, The Cochrane Collaboration, European Respiratory Society, Agency for Healthcare Research and Quality, and Society of Critical Care Medicine (a complete list is available on the GRADE working group web site) (5). A consensus seems to be emerging around the adoption of GRADE. This would be a welcome progression because such widespread adoption will help maintain clarity and consistency in guidelines across medical disciplines.

The Endocrine Society appraised the merits of the GRADE system and decided in late 2004 to adopt it as the basis for its clinical practice guidelines. The Endocrine Society was the first North American organization to adopt GRADE and use it in its Clinical Practice Guidelines program. Guidelines on the use of testosterone in men (6), on the treatment and prevention of pediatric obesity, and on the diagnosis and treatment of hirsutism are examples of the application of the GRADE system to The Endocrine Society guidelines. However, endocrinologists have not had access to a context-specific discussion of this system as it relates to guidelines in endocrinology. In the following sections, we will use endocrinology examples to illustrate how this grading system helps improve the rigor and usefulness of clinical practice guidelines.


    The GRADE System
 
The GRADE system classifies recommendations into one of two grades (strong or weak) and the quality of the evidence into one of four categories (high, moderate, low, or very low). This offers a simple and practical, yet methodologically rigorous, grading system for The Endocrine Society’s Clinical Practice Guidelines program.

To further enhance the interpretation and clarity of the recommendations, guideline developers use the term "we recommend" to denote strong recommendations, whereas weak recommendations use the less definitive wording "we suggest." Furthermore, a strong recommendation receives a grade 1 classification, and a weak recommendation receives a grade 2 classification. The symbols chosen for the four levels of quality of evidence are: ⊕○○○ (very low); ⊕⊕○○ (low); ⊕⊕⊕○ (moderate); and ⊕⊕⊕⊕ (high) quality. Table 1 provides an overview of the GRADE system and a closer look at the components of each of its recommendation categories.
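The notation described above can be captured in a small data structure. The following Python sketch is illustrative only: the class and function names are hypothetical, and the "grade|symbols" layout of the combined label is an assumption made for this example rather than something specified in the text.

```python
from enum import Enum


class Strength(Enum):
    """Strength of a recommendation: (grade number, wording) as given in the text."""
    STRONG = (1, "we recommend")
    WEAK = (2, "we suggest")


class Quality(Enum):
    """Quality of evidence, stored as the number of filled circles out of four."""
    VERY_LOW = 1
    LOW = 2
    MODERATE = 3
    HIGH = 4


def quality_symbols(quality: Quality) -> str:
    """Return the four-symbol string for a quality level, e.g. moderate -> ⊕⊕⊕○."""
    filled = quality.value
    return "⊕" * filled + "○" * (4 - filled)


def label(strength: Strength, quality: Quality) -> str:
    """Combine grade number, symbols, and wording (the layout is an assumption)."""
    grade, wording = strength.value
    quality_name = quality.name.lower().replace("_", " ")
    return f"{grade}|{quality_symbols(quality)} ({wording}; {quality_name} quality evidence)"


if __name__ == "__main__":
    print(label(Strength.STRONG, Quality.MODERATE))
    print(label(Strength.WEAK, Quality.VERY_LOW))
```

Keeping strength and quality as two separate types mirrors the point made throughout this article: the two judgments are reported together but are made separately.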


 
TABLE 1. GRADE recommendations–a closer look

 

    Strength of Recommendations
 
The strength of a recommendation reflects the degree of confidence that the desirable effects of a recommendation outweigh the undesirable effects. Desirable effects can include beneficial health outcomes, less burden, and cost savings. Undesirable effects can include harms, more burden, and expenses. Burdens are the demands of adhering to a recommendation that patients or caregivers (e.g. their family) may dislike, such as having to take medication or the inconvenience of going to the doctor’s office. Although the degree of confidence is a continuum, the GRADE approach classifies recommendations for or against treatments into two grades, strong and weak.

If guideline developers are confident that the desirable effects of adherence to a recommendation outweigh the undesirable effects, they will make a strong recommendation within the context of a described intervention. Typically, this requires high or moderate quality evidence on patient-important outcomes. Exceptionally, panels can make strong recommendations based on low or very low quality evidence. This may occur when the values and preferences guideline developers bring to bear are such that, even when considering low quality evidence, they are confident that the benefits of an intervention outweigh the undesirable outcomes (or vice versa). In these cases the panel can make a strong recommendation for (or against) the intervention.

For example, consider the decision to administer aspirin or acetaminophen to children with chicken pox. Observational studies have noted an association between aspirin administration and Reye syndrome. Because aspirin and acetaminophen are, in this context, similar in their analgesic and antipyretic effects, guideline developers may make a strong recommendation for acetaminophen despite the low quality evidence suggesting harm from aspirin because they place a very high value on avoiding potential life-threatening adverse effects.

A weak recommendation is one for which a guideline panel concludes that the desirable effects of adherence to a recommendation probably outweigh the undesirable effects, but the panel is not confident. Thus, if guideline developers believe that benefits and downsides are finely balanced, or appreciable uncertainty exists about this balance, they offer a weak recommendation. Accordingly, low or very low quality evidence usually leads to weak recommendations because of uncertainty about the balance between risks and benefits. Guideline panels may also offer weak recommendations when high quality evidence is available, if that evidence clearly demonstrates that the benefits and risks are closely balanced. For example, a guideline panel may weakly recommend bisphosphonates in relatively low-risk patients with osteopenia, in whom the burden and costs of monitoring and treatment may or may not be worth the potential reduction in the risk of fragility fractures documented in randomized trials.

Table 2 summarizes the factors that influence the strength of a recommendation, factors that broadly correspond to: 1) certainty about the balance between benefits vs. burdens and harms, 2) resource use, and 3) variation in values and preferences. Consideration of this latter issue is key. Guideline panels will typically, either explicitly or implicitly, use their own preferences as imperfect proxies of patient values. Alternatively, they could consider the range of patients to whom the recommendation applies, and their range of values and preferences. Ideally, they will find a way to ensure that the recommendation is consistent with the values and preferences of most patients. How to achieve this goal remains a challenge; one approach includes involving relevant patients as panel members or involving patient groups able to minimize influences that could bias their judgments in the assessment of values and preferences.


 
TABLE 2. Factors in deciding on a strong or weak recommendation

 
There are practical implications relating the strength of recommendation for or against a therapy with patient values and preferences. For instance, a strong recommendation implies that virtually all patients, across the range of individual values found in the population, will make the same treatment decision. Strong recommendations allow clinicians to offer treatment with confidence, commonly with limited to no consideration of alternative options. Weak recommendations imply that different patients, in different clinical contexts, with different values and preferences, will likely make different choices. In the face of weak recommendations, clinicians will need to be more deliberate and judicious in explicitly incorporating evidence regarding the magnitude of benefits and risks along with patient circumstances, values, and preferences to make the best decision. In other words, with weak recommendations, the clinician will need to have a more detailed and deliberate discussion with the patient, reviewing several reasonable options. This is particularly important when clinicians and patients find their own values and preferences at odds with those the guideline panel considered in making its recommendations.

We do not know how individual clinicians can best achieve the goal of incorporating patient values and preferences in following a weak recommendation, but some promising approaches exist. For example, some clinicians are using decision aids. Decision aids are tools that help clinicians communicate to patients the relevant evidence about the available options and their relative merits in a quantitative form (for examples, see http://kerunit.e-bm.org). Randomized trials have shown that these tools can improve the quality of decision making in many clinical settings (7). Conversely, for strong recommendations, a decision aid could be an inefficient use of time and other resources, although it is plausible that having the patient participate in making treatment choices may enhance adherence to therapy (8).


    Quality of the Evidence
 
To determine the strength of the recommendations, the GRADE system explicitly considers the quality of the best available evidence identified through a comprehensive review of the literature. Study design and conduct are important determinants of quality. Randomized controlled trials (RCTs) allow decision makers to draw causal inferences linking interventions and outcomes with protection against bias. Therefore, RCTs begin with a "high" quality rating.

Because of possible limitations that fall into five categories (Table 3), even RCTs may not provide high quality evidence. First, there may be serious limitations in the design and conduct of RCTs (including lack of concealment and blinding, and large loss to follow-up). These limitations weaken the inferences decision makers can draw from the data and therefore lower the quality level of the evidence. For example, to inform guideline developers about the effect of physical activity on pediatric obesity, the authors reviewed the results of a metaanalysis of 20 relevant RCTs (9). These trials had no reported allocation concealment or blinding and had significant loss to follow-up (29% of studies reported greater than a 20% loss). Therefore, the guideline panel downgraded the quality of the evidence.


 
TABLE 3. Factors in deciding on confidence in estimates of benefits, risks, burdens, and costs

 
Second, if the results of trials are highly variable, we will have less confidence in the estimates of efficacy, and the evidence will have lower quality. For instance, RCTs of testosterone use in adult men reveal an inconsistent effect on lumbar spine bone mineral density and on libido (in trials that enrolled men with low testosterone levels) (10). These findings lower the quality of the evidence. However, in the first case, a planned subgroup analysis revealed a significant and large interaction between the route of administration and the treatment effect, explaining the inconsistency, and increasing the confidence of the guideline developers about the effect of intramuscular testosterone on lumbar spine bone mineral density (Figs. 1 and 2) (10). Thus, guideline developers did not need to downgrade the quality rating for this evidence because of inconsistency. In contrast, developers downgraded the evidence linking testosterone use and libido because of unexplained and very large inconsistency (11).


 
FIG. 1. Inconsistent results. This displays random-effects metaanalysis results of eight trials of testosterone on lumbar spine bone mineral density. I² (a statistic that reflects the proportion of variation between studies that is not due to chance, i.e. inconsistency) is 46%, which identifies substantial inconsistency. Vertical line indicates no treatment effect; squares and horizontal lines indicate point estimates and associated 95% confidence intervals (CIs) for each study. Diamonds indicate the random-effects pooled standardized mean difference (SMD) with the width representing its confidence interval (10).

 

 
FIG. 2. Consistent results after subgroup analysis. This displays random-effects metaanalysis results of the same eight trials of testosterone on lumbar spine bone mineral density that are displayed in Fig. 1, but after subgroup analysis by administration route. There was no measurable inconsistency (I² = 0%) in either subgroup. Vertical line indicates no treatment effect; squares and horizontal lines indicate point estimates and associated 95% confidence intervals (CIs) for each study. Diamonds indicate the random-effects pooled standardized mean difference (SMD) with the width representing its confidence interval (10).
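For readers who wish to see how the inconsistency statistic in Figs. 1 and 2 is obtained, the following Python sketch computes I² from Cochran's Q using the standard definition, I² = max(0, (Q − df)/Q) × 100%, with inverse-variance weights. The function name and the input values in the usage example are hypothetical; they are not the data from the testosterone trials.

```python
def i_squared(effects, std_errors):
    """Return I² (as a percentage) for a set of study effect estimates.

    effects    -- per-study effect estimates (e.g. standardized mean differences)
    std_errors -- per-study standard errors, in the same order
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two studies with matching standard errors")

    # Inverse-variance weights, as used for Cochran's Q.
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    if q <= 0:
        return 0.0
    # Negative values are truncated to zero by convention.
    return max(0.0, (q - df) / q) * 100.0


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    print(i_squared([0.1, 0.9, 0.2, 0.8], [0.2, 0.2, 0.2, 0.2]))
    print(i_squared([0.5, 0.5, 0.5], [0.2, 0.3, 0.25]))
```

When a subgroup variable explains the differences between trials, computing I² separately within each subgroup yields values near zero, which is the pattern shown in Fig. 2.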

 
Third, a reduction in quality will occur when evidence supporting a recommendation is indirect. Indirectness may occur if: the patients enrolled in relevant trials differ in important ways from those under consideration by the guideline panel; the intervention or the comparator intervention tested in the trials differ in important ways (nature, dosing, duration) from those under consideration; or the outcomes differ (typically investigators will have measured effects on a substitute or surrogate outcome, rather than the patient-important outcome in which the guideline panel is primarily interested).

For example, when considering the use of testosterone gel to prevent fragility fractures in elderly hypogonadal men, evidence from trials enrolling younger men shows that intramuscular testosterone can increase bone mineral density (12, 13). Here, the evidence informs the efficacy of a different testosterone formulation on a different patient group on a surrogate outcome of no importance, in and of itself, to patients (bone density rather than fracture risk); no high-quality trials have answered the question of direct relevance to the guideline developer. If a recommendation were made specific to the use of testosterone gel to prevent fractures in elderly men, the quality of the evidence would be downgraded based on indirectness with respect to the population, intervention, and outcome. Furthermore, the guideline panel interested in making recommendations about the use of testosterone for osteoporosis will have to rely only on indirect comparisons (i.e. trials of each agent against placebo but no head-to-head trials) when considering the relative merits of testosterone vs. bisphosphonates, for instance.

Fourth, guideline developers should downgrade evidence when few studies, involving few participants and, most importantly, documenting few outcomes, inform the tradeoffs of risks and benefits. As an example, a metaanalysis of the results of trials evaluating the effects of testosterone on cardiovascular outcomes suggests that testosterone does not have an effect on cardiovascular events. However, this result is based on only six trials, a total of 308 participants, and only 21 outcomes. Considering the confidence interval width, the pooled data are consistent with both a 1-fold decrease and a 4-fold increase in the odds of cardiac events in patients treated with testosterone (14). This evidence carries great uncertainty, lowering the confidence that the estimates are accurate.
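The link between few events and a wide confidence interval can be illustrated with a short calculation. The Python sketch below computes an odds ratio and its approximate 95% confidence interval for a single two-arm comparison using the standard log odds ratio method. It is not a metaanalysis, the function name is hypothetical, and the counts in the usage example are invented for illustration; they are not the data from the testosterone trials.

```python
import math


def odds_ratio_ci(events_treated, n_treated, events_control, n_control, z=1.96):
    """Return (odds ratio, lower limit, upper limit) for one two-arm comparison."""
    a, b = events_treated, n_treated - events_treated
    c, d = events_control, n_control - events_control
    if min(a, b, c, d) <= 0:
        raise ValueError("every cell of the 2x2 table must be positive in this sketch")

    log_or = math.log((a * d) / (b * c))
    # Standard error of the log odds ratio: driven mainly by the smallest cells,
    # which in practice are the event counts.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


if __name__ == "__main__":
    # Hypothetical counts: same event proportions, 100-fold difference in size.
    print(odds_ratio_ci(10, 100, 5, 100))
    print(odds_ratio_ci(1000, 10000, 500, 10000))
```

With the same proportion of events in each arm, the smaller comparison yields an interval that spans both benefit and harm, whereas the larger one does not.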

Finally, guideline developers should have limited confidence when reporting bias might have affected the underlying evidence. Publication bias, one form of reporting bias, occurs because trials that show no significant effect are less likely to be published, and outcome reporting bias occurs when researchers selectively report their findings depending on their significance. Clinical trial registries may help reduce publication bias (15). Chan et al. (16) found that reporting of trial outcomes is frequently incomplete, biased, and inconsistent with the original trial protocols. Prospective public registration of trial protocols could help diminish this concern. Box 1 describes an example of reporting bias. Publication bias is more likely to take place in fields in which small trials are the norm (e.g. many endocrinopathies) because large trials are less likely to remain unpublished. Although difficult to ascertain, reporting bias is prevalent, particularly when key patient-important outcomes are only reported in a few studies.

In contrast to RCTs, observational studies start with a "low" (case-control and cohort studies) or "very low" (unsystematic clinical observations, case reports, and case series) quality level but may be upgraded in certain situations, for example when the magnitude of the treatment effect is very large (e.g. use of insulin to prevent morbidity and mortality in patients with type 1 diabetes presenting in diabetic ketoacidosis; use of glucocorticoids to prevent adrenal crisis in patients with Addison’s disease). Thus, it is very important in guidelines to specify clearly the alternatives considered. Although high quality evidence, as we have seen, supports the use of glucocorticoids to prevent adrenal crises in patients with Addison’s disease, low quality evidence supports the choice of a specific glucocorticoid replacement regimen out of several in common use.

In addition, the quality level can increase when all plausible confounders would reduce the magnitude of the treatment effect, yet the effect remains sizeable. For example, a systematic review showed higher mortality in for-profit hospitals when compared with not-for-profit hospitals (17). This result occurred despite the fact that for-profit hospitals usually have additional resources available and generally admit healthier patients, factors that should work in their favor. Considering these confounders would increase the magnitude of benefit of not-for-profit hospitals (3). Table 3 summarizes factors that influence the quality of evidence.
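The rating logic described in this section can be summarized as a simplified Python sketch. It assumes that each factor moves the rating by a whole number of levels chosen by the panel; the text does not specify how many levels each factor is worth, so that number is an input here. The factor names, the function name, and the restriction of upgrading to observational designs are choices made for this illustration, and in practice each step is a matter of panel judgment rather than arithmetic.

```python
LEVELS = ["very low", "low", "moderate", "high"]

# Starting level by study design, as described in the text.
STARTING_LEVEL = {
    "rct": "high",
    "cohort": "low",
    "case-control": "low",
    "case series": "very low",
    "case report": "very low",
    "unsystematic observation": "very low",
}

# The five categories of limitations and the two upgrading situations.
DOWNGRADE_FACTORS = {
    "design limitations", "inconsistency", "indirectness",
    "imprecision", "reporting bias",
}
UPGRADE_FACTORS = {"very large effect", "confounders would reduce effect"}


def rate_quality(design, downgrades=None, upgrades=None):
    """Return the quality level after applying panel-chosen level changes.

    downgrades / upgrades map a factor name to the number of levels to move.
    """
    downgrades = downgrades or {}
    upgrades = upgrades or {}
    unknown = (set(downgrades) - DOWNGRADE_FACTORS) | (set(upgrades) - UPGRADE_FACTORS)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")
    if design == "rct" and upgrades:
        raise ValueError("this sketch upgrades observational evidence only")

    index = LEVELS.index(STARTING_LEVEL[design])
    index += sum(upgrades.values()) - sum(downgrades.values())
    # The rating cannot fall below "very low" or rise above "high".
    return LEVELS[max(0, min(index, len(LEVELS) - 1))]


if __name__ == "__main__":
    print(rate_quality("rct", downgrades={"design limitations": 1}))
    print(rate_quality("cohort", upgrades={"very large effect": 2}))
```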


    Values and Preferences
 
As mentioned previously, values and preferences are essential to guidelines. The GRADE system offers insights into the role of values and preferences when it disentangles the strength of recommendations from the quality of the evidence, and encourages statements about the underlying values and preferences relevant to the recommendations.

Consider the interpretation of guidelines in the case of an individual patient. A guideline may weakly recommend (a "suggestion," using the terminology of The Endocrine Society Clinical Practice Guidelines) that patients receive treatment with a medication based on low quality evidence because there is uncertainty about the tradeoffs between potential desirable and undesirable effects. An individual patient may place a high value on potential resolution of their symptoms and a low value on avoiding possible side effects, costs, and follow-up visits and tests while taking the medication. Such a patient may prefer to take this medication, in keeping with the suggestion. Another patient in similar circumstances may have different values, placing a higher value on avoiding potential adverse effects, costs, and burdens of medical treatment.

For example, when making a decision on treatment options for the prevention of osteoporotic fractures, some experts may formulate recommendations in favor of treatment with teriparatide for women at high fracture risk. One woman may share values and preferences in keeping with this recommendation, whereas another woman, in the same situation, may find the route of administration (injection) or the cost of teriparatide unacceptable and would thus prefer not to take the medication. The use of the GRADE system, with its transparency, offers patients and clinicians the opportunity to consider and make different clinical decisions, including decisions to not use an intervention that is weakly recommended (or to use one that the guideline weakly recommends against).

The appendix (published as supplemental data on The Endocrine Society’s Journals Online web site at http://jcem.endojournals.org) offers illustrations from The Endocrine Society Clinical Practice Guidelines to highlight the issues presented here.


    Future Directions
 
GRADE does not answer all questions related to rigorous guideline development, but many areas, such as diagnostic recommendations and consideration of resource allocation, are in active development. We anticipate updating the endocrine readership when further guidance becomes available.

With regard to considering resource allocation in guidelines, there are challenges concerning the clarity, conflicts, validity, and applicability of the evidence (e.g. cost-effectiveness analyses), challenges in the interpretation and use of economic analyses to formulate guidelines (without the guidance of a health economist), and the impact of such analyses when guidelines are intended for broad, or even international, audiences. The American College of Chest Physicians has suggested an approach to this problem that is consistent with GRADE (18). The GRADE working group is preparing documents and a conference that will provide additional guidance on this topic.

There is also uncertainty as to the ideal composition of the guideline panel. Some favor broad representation, expanding from the usual set of clinical experts to include patients and health officials. However, how to select patients for participation in guidelines (e.g. highly educated patients are likely to participate actively, but they may not share values with many other patients), how to engage them in the process, and how to acknowledge their contribution are the subject of evolving science (19, 20, 21). The ability to incorporate values and preferences in guideline development through direct patient consultation is a fascinating prospect.


    Conclusions
 
Guideline development processes that adhere to the principles of evidence-based medicine, such as the GRADE system, offer clarity, transparency, consistency, and helpfulness for academic and professional organizations seeking to provide their clinicians with practice recommendations. Further experience with GRADE in endocrine guidelines, and greater familiarity with the system among its users, could enhance evidence-based endocrine practice.

Box 1. An example of reporting bias

A systematic review of the effects of testosterone on erection satisfaction and function in patients with low testosterone offers an example of reporting bias. In this review, the authors found one large trial that specifically addressed this issue in addition to three smaller trials (11). However, the published paper reported the large trial’s results on the outcome of interest only as "not significant"; the actual data were not reported and, therefore, could not be used in a metaanalysis. Pooling the data from the three other trials showed a large treatment effect with testosterone therapy (difference between arms of 1.3 SD values, 95% confidence interval 0.2 to 2.3). However, after the complete data from the larger trial were obtained, the new pooled treatment effect was smaller in magnitude, much less precise, and no longer significant (0.8 SD values, 95% confidence interval –0.05 to 1.63) (23).
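The change in statistical significance described in Box 1 can be checked with a short calculation. The following is a minimal sketch, not the method used in the cited reviews: it assumes each reported 95% confidence interval is a symmetric normal-approximation interval and back-calculates an approximate standard error, z statistic, and two-sided P value. The function name `summarize` and the variable names are hypothetical, and because the published intervals are rounded the results are approximate.

```python
from math import erf, sqrt

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def summarize(estimate, lower, upper):
    """Approximate SE, z, and two-sided P value from an estimate and its 95% CI.

    Assumes a symmetric normal-approximation interval (an assumption made
    for illustration; the original analyses may have used other methods).
    """
    se = (upper - lower) / (2 * Z_95)           # half-width divided by 1.96
    z = estimate / se                            # test of "no difference"
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - normal_cdf)
    excludes_zero = lower > 0 or upper < 0       # CI excludes a null effect
    return {"se": se, "z": z, "p": p_value, "excludes_zero": excludes_zero}


# Pooled effects as reported in Box 1 (in SD units)
three_trials = summarize(1.3, 0.2, 2.3)      # without the large trial
all_trials = summarize(0.8, -0.05, 1.63)     # after adding the large trial

for label, result in (("three trials", three_trials), ("all trials", all_trials)):
    print(label, {k: round(v, 3) if isinstance(v, float) else v
                  for k, v in result.items()})
```

Under these assumptions, the interval from the three smaller trials excludes zero, whereas the interval that includes the large trial does not, which matches the loss of significance reported in the box.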


    Acknowledgments
 
We thank the leaders and members of The Endocrine Society Task Forces who have pioneered the use of Grading of Recommendations, Assessment, Development, and Evaluation, and other efforts to formulate recommendations in endocrine practice.


    Footnotes
 
Disclosure Statement: V.M.M. receives funding from The Endocrine Society to conduct systematic reviews and metaanalyses in support of clinical practice guidelines. R.A.V. chairs the Clinical Guidelines Subcommittee of The Endocrine Society. H.J.S., R.K., G.H.G., and V.M.M. are members of the Grading of Recommendations, Assessment, Development, and Evaluation Working Group. H.J.S. is funded by a European Commission Grant (The human factor, mobility and Marie Curie Actions. Scientist Reintegration Grant IGR 42192–"GRADE"). Otherwise, the authors have nothing to disclose.

First Published Online January 2, 2008

Abbreviations: GRADE, Grading of Recommendations, Assessment, Development, and Evaluation; RCT, randomized controlled trial.

Received August 24, 2007.

Accepted December 21, 2007.


    References
 

  1. Guyatt GH, Haynes B, Jaeschke R, Cook D, Greenhalgh T, Meade M, Green L, Naylor C, Wilson M, McAlister FA, Richardson W, Montori V, Bucher H 2002 Introduction: the philosophy of evidence-based medicine. In: Guyatt GH, Rennie D, eds. Users’ guides to the medical literature: a manual of evidence-based clinical practice. Chicago: AMA Press; 121–140
  2. Schunemann HJ, Best D, Vist G, Oxman AD 2003 Letters, numbers, symbols and words: how to communicate grades of evidence and recommendations. CMAJ [Erratum (2004) 170:1082] 169:677–680
  3. Guyatt G, Vist G, Falck-Ytter Y, Kunz R, Magrini N, Schunemann H 2006 An emerging consensus on grading recommendations? ACP J Club 144:A8–A9
  4. Atkins D, Best D, Briss PA, Eccles M, Falck-Ytter Y, Flottorp S, Guyatt GH, Harbour RT, Haugh MC, Henry D, Hill S, Jaeschke R, Leng G, Liberati A, Magrini N, Mason J, Middleton P, Mrukowicz J, O’Connell D, Oxman AD, Phillips B, Schunemann HJ, Edejer TT, Varonen H, Vist GE, Williams Jr JW, Zaza S 2004 Grading quality of evidence and strength of recommendations. BMJ 328:1490
  5. GRADE working group 2007 Organizations. Available at: http://www.gradeworkinggroup.org/society/index.htm. Accessed May 8, 2007
  6. Bhasin S, Cunningham GR, Hayes FJ, Matsumoto AM, Snyder PJ, Swerdloff RS, Montori VM 2006 Testosterone therapy in adult men with androgen deficiency syndromes: an endocrine society clinical practice guideline. J Clin Endocrinol Metab 91:1995–2010
  7. O’Connor AM, Stacey D, Entwistle V, Llewellyn-Thomas H, Rovner D, Holmes-Rovner M, Tait V, Tetroe J, Fiset V, Barry M, Jones J 2003 Decision aids for people facing health treatment or screening decisions. Cochrane Database Syst Rev (2):CD001431
  8. Weymiller AJ, Montori VM, Jones LA, Gafni A, Guyatt GH, Bryant SC, Christianson TJ, Mullan RJ, Smith SA 2007 Helping patients with type 2 diabetes mellitus make treatment decisions: statin choice randomized trial. Arch Intern Med 167:1076–1082
  9. McGovern L, Johnson JN, Paulo R, Hettinger A, Singhal V, Kamath C, Erwin PJ, Montori VM Treatment of pediatric obesity. A systematic review and metaanalysis of randomized trials. J Clin Endocrinol Metab, in press
  10. Tracz MJ, Sideras K, Bolona ER, Haddad RM, Kennedy CC, Uraga MV, Caples SM, Erwin PJ, Montori VM 2006 Testosterone use in men and its effects on bone health. A systematic review and meta-analysis of randomized placebo-controlled trials. J Clin Endocrinol Metab 91:2011–2016
  11. Bolona ER, Uraga MV, Haddad RM, Tracz MJ, Sideras K, Kennedy CC, Caples SM, Erwin PJ, Montori VM 2007 Testosterone use in men with sexual dysfunction: a systematic review and meta-analysis of randomized placebo-controlled trials. Mayo Clin Proc 82:20–28
  12. Behre HM, Kliesch S, Leifke E, Link TM, Nieschlag E 1997 Long-term effect of testosterone therapy on bone mineral density in hypogonadal men. J Clin Endocrinol Metab 82:2386–2390
  13. Katznelson L, Finkelstein JS, Schoenfeld DA, Rosenthal DI, Anderson EJ, Klibanski A 1996 Increase in bone density and lean body mass during testosterone administration in men with acquired hypogonadism. J Clin Endocrinol Metab 81:4358–4365
  14. Haddad RM, Kennedy CC, Caples SM, Tracz MJ, Bolona ER, Sideras K, Uraga MV, Erwin PJ, Montori VM 2007 Testosterone and cardiovascular risk in men: a systematic review and meta-analysis of randomized placebo-controlled trials. Mayo Clin Proc 82:29–39
  15. Laine C, Horton R, DeAngelis CD, Drazen JM, Frizelle FA, Godlee F, Haug C, Hebert PC, Kotzin S, Marusic A, Sahni P, Schroeder TV, Sox HC, Van der Weyden MB, Verheugt FW 2007 Clinical trial registration–looking back and moving ahead. N Engl J Med 356:2734–2736
  16. Chan AW, Hrobjartsson A, Haahr MT, Gotzsche PC, Altman DG 2004 Empirical evidence for selective reporting of outcomes in randomized trials: comparison of protocols to published articles. JAMA 291:2457–2465
  17. Devereaux PJ, Choi PT, Lacchetti C, Weaver B, Schunemann HJ, Haines T, Lavis JN, Grant BJ, Haslam DR, Bhandari M, Sullivan T, Cook DJ, Walter SD, Meade M, Khan H, Bhatnagar N, Guyatt GH 2002 A systematic review and meta-analysis of studies comparing mortality rates of private for-profit and private not-for-profit hospitals. CMAJ 166:1399–1406
  18. Guyatt G, Baumann M, Pauker S, Halperin J, Maurer J, Owens DK, Tosteson AN, Carlin B, Gutterman D, Prins M, Lewis SZ, Schunemann H 2006 Addressing resource allocation issues in recommendations from clinical practice guideline panels: suggestions from an American College of Chest Physicians task force. Chest 129:182–187
  19. Fretheim A, Schunemann HJ, Oxman AD 2006 Improving the use of research evidence in guideline development: 5. Group processes. Health Res Policy Syst 4:17
  20. Fretheim A, Schunemann HJ, Oxman AD 2006 Improving the use of research evidence in guideline development: 3. Group composition and consultation process. Health Res Policy Syst 4:15
  21. Schunemann HJ, Fretheim A, Oxman AD 2006 Improving the use of research evidence in guideline development: 10. Integrating values and consumer involvement. Health Res Policy Syst 4:22
  22. Schunemann HJ, Jaeschke R, Cook DJ, Bria WF, El-Solh AA, Ernst A, Fahy BF, Gould MK, Horan KL, Krishnan JA, Manthous CA, Maurer JR, McNicholas WT, Oxman AD, Rubenfeld G, Turino GM, Guyatt G 2006 An official ATS statement: grading the quality of evidence and strength of recommendations in ATS guidelines and recommendations. Am J Respir Crit Care Med 174:605–614
  23. Sinha MK, Montori VM 2006 Reporting bias and other biases affecting systematic reviews and meta-analyses: a methodological commentary. Expert Rev Pharmacoeconomics Outcomes Res 6:603–611




