Father Sean O'Sullivan Research Centre (L.P.R., A.O., L.T.), St. Joseph's Healthcare Hamilton, Hamilton, Ontario, L8N 4A6 Canada; and Department of Clinical Epidemiology and Biostatistics (A.O., L.T.), Department of Medicine (L.P.R.), and Bachelor of Health Sciences Program (M.O.M., M.O.R.), Faculty of Health Sciences, McMaster University, Hamilton, Ontario, L8N 3Z5 Canada
Address all correspondence and requests for reprints to: Lehana Thabane, Biostatistics/FSORC, 3rd Floor Martha, Room H325, St. Joseph's Healthcare Hamilton, 50 Charlton Avenue East, Hamilton, Ontario, L8N 4A6 Canada. E-mail: ThabanL{at}mcmaster.ca.
| Abstract |
|---|
Objective: Our aim was to assess the reporting quality of RCTs in general endocrinology. A secondary objective was to identify predictors for better reporting quality.
Design and Setting: We systematically reviewed RCTs published in three general endocrinology journals between January 2005 and December 2006.
Participants: We included parallel-design RCTs that addressed a question of treatment or prevention. Article selection and data abstraction were conducted by two reviewers independently, and disagreements were resolved by consensus.
Main Outcomes: There were two main outcomes: 1) a 15-point overall reporting quality score (OQS) based on the Consolidated Standards of Reporting Trials (CONSORT); and 2) a 3-point key score, based on allocation concealment, blinding, and use of intention-to-treat analysis.
Results: Eighty-nine RCTs were included. The median OQS was 10 (interquartile range = 2). Allocation concealment, blinding, and analysis by intention to treat were reported in 10, 20, and 16 of the 89 RCTs, respectively. A multivariable regression analysis showed that complete industry funding [incidence rate ratio (IRR) = 1.014; 95% confidence interval (CI), 1.010–1.018], journal of publication (IRR = 1.068; 95% CI, 1.007–1.132), and sample size (IRR = 1.048; 95% CI, 1.026–1.070) were significantly associated with a slightly better OQS.
Conclusions: The quality of RCT reporting in general endocrine literature is suboptimal. We discuss our results, highlight the areas where improvements are needed, and provide some recommendations.
| Introduction |
|---|
Usually, the RCT report is the only available evidence for the healthcare provider and researcher to appraise the quality of the design, conduct, and analysis of the RCT. Therefore, rigorous methodological quality of an RCT should be reflected in the quality of its report. However, in the 1990s, significant shortcomings in the quality of RCT reporting were recognized in a wide variety of journals on general medicine, internal medicine, and specializations (3, 4, 5, 6, 7, 8, 9, 10). Moreover, low RCT reporting quality was shown to be associated with overestimation of the efficacy of interventions by as much as 40% (2). Such overestimation could lead to introduction of therapies that are actually less effective or even ineffective.
In response to this problem, the Consolidated Standards of Reporting Trials (CONSORT) group developed the CONSORT statement in 1996 and published a revised version in 2001 (11, 12, 13). The CONSORT statement proposes specific guidelines for reporting RCTs, comprising a 22-item checklist and a participant flow diagram. The statement has been translated into 10 languages and endorsed by prominent general medical journals, many specialty medical journals, and leading editorial organizations. Since the publication of CONSORT, studies have continued to document suboptimal quality of RCT reporting in leading medical journals and, particularly, in specialist journals (14, 15, 16, 17, 18, 19, 20, 21). On the other hand, journal adoption of CONSORT has been associated with improved reporting of RCTs (15, 22, 23, 24, 25).
To the best of our knowledge, there are no published studies on the quality of RCT reporting in endocrinology journals. Taking into account the accumulated evidence in other areas of medicine and the fact that general endocrinology journals have not yet formally endorsed the CONSORT statement, we considered it particularly relevant to explore and quantify the quality of RCT reporting in this field. Therefore, we conducted this observational study to appraise the quality of RCT reporting in the general endocrine literature. Our aims were 1) to assess the overall reporting quality of published articles using a standardized tool based on the CONSORT statement and identify the most frequent methodological problems; 2) to specifically evaluate the reporting quality of three key methodological factors that safeguard against bias, i.e. allocation concealment, blinding, and analysis by intention-to-treat principle; and 3) to investigate whether specific study characteristics, such as journal of publication, funding source, sample size, disease, and type of treatment, are associated with better reporting quality.
| Materials and Methods |
|---|
We selected the three general endocrinology journals with the highest impact factor in 2006 as our source of RCTs. These are The Journal of Clinical Endocrinology and Metabolism (JCEM) (impact factor = 5.8), Clinical Endocrinology (impact factor = 3.4), and the European Journal of Endocrinology (impact factor = 3.1). Two investigators (L.R. and A.O.) independently performed a manual search of each published issue of the three journals between January 2005 and December 2006. The abstracts were reviewed in duplicate, and all the articles considered eligible RCTs by either reviewer were further evaluated in full text (Fig. 1). A study was defined as an RCT if the assignment of participants to interventions was described by words such as "randomly allocated," "assigned at random," or "allocated by randomization," and if a control group was present. The control group could be placebo, another treatment, a different dose of the same treatment, usual care, or just no treatment.
Data extraction
An electronic standardized data collection form was used to extract data from each article. Two pairs of trained reviewers (M.M./L.R. and A.O./M.R.), blinded to each other's ratings, abstracted data independently. The form was previously pilot tested and revised. Any disagreement was resolved through consensus, and chance-adjusted inter-rater agreements were calculated.
Rating of overall reporting quality
Given that we defined quality of reporting as the extent to which the rationale, method, conduct, and results of the trial are reported, we adopted 15 relevant items from the revised CONSORT statement for our appraisal. These items were chosen because lack of their reporting has been associated with a higher level of bias (11). The CONSORT discussion section items were excluded because we considered them too subjective to evaluate. Three key methodological qualities were also excluded for a separate assessment. Thus, an overall quality score (OQS) with 15 items was constructed. Each item was scored 1 if it was reported and 0 if it was not clearly stated or definitely not stated. The scale thus had a possible score between 0 and 15. Two additional items that are important to be present in any RCT report were investigated; these are the provision of a participant flow chart according to the CONSORT recommendations and an ethical item (11, 26).
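The scoring rule described above can be sketched in a few lines of Python. This is an illustration only, not the authors' data collection form: the item labels (`item_1` to `item_15`) and the example trial are hypothetical, and the only facts taken from the text are that there are 15 items, each scored 1 if reported and 0 otherwise.

```python
from typing import Mapping

N_OQS_ITEMS = 15  # number of CONSORT-derived items in the OQS, per the text


def overall_quality_score(item_reported: Mapping[str, bool]) -> int:
    """Return the OQS: 1 point per reported item, 0 for unclear or unstated items."""
    if len(item_reported) != N_OQS_ITEMS:
        raise ValueError(
            f"expected {N_OQS_ITEMS} items, got {len(item_reported)}"
        )
    return sum(1 for reported in item_reported.values() if reported)


# Hypothetical trial report: items 1-10 reported, items 11-15 not clearly stated.
example_trial = {f"item_{i}": i <= 10 for i in range(1, N_OQS_ITEMS + 1)}
print(overall_quality_score(example_trial))
```

Because every item carries equal weight, the score counts how many items were reported but says nothing about which ones were missing.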
Rating of key methodological items
Concealment of allocation, appropriate blinding, and analysis according to intention-to-treat principle were assessed separately because they are highly important for avoiding bias and distortions of the effect estimates (2, 27). Moreover, these domains are often underreported even in studies with high OQS (28).
Allocation concealment is the masking of the upcoming treatment assignments from the individuals in charge of patient enrollment and treatment allocation. The lack of allocation concealment permits selective assignment of patients by manipulation of either the sequence of treatments to be allocated or the sequence of patients to be enrolled, destroying the main purpose of randomization, which is to avoid selection bias. Allocation concealment was considered appropriate in this study if one of the following allocation methods was reported: 1) centralized randomization; 2) numbered, coded vehicles; and 3) opaque, sealed, and sequentially numbered envelopes. Inadequate methods concerned open or predictable sequences of allocation, such as alternation, date of birth, case record number, and open tables of random numbers. These criteria are similar to those recommended for Cochrane reviews (29).
In the assessment of blinding, the often-used terms "single blind" and "double blind" were considered insufficient for readers to determine who was blinded. These terms are ambiguous and can have different meanings to different readers (30). In fact, blinding can occur at several different levels: patients, healthcare providers, data collectors, outcome assessors, data analysts, and even manuscript writers (31). Although blinding of patients and treating physicians is sometimes not possible, blinding of at least one other group such as data collectors, outcome assessors, or data analysts is always feasible. In RCTs involving radioiodine therapy, training or dieting programs, or other physical interventions like surgery or laser photocoagulation for thyroid nodules, blinding of some groups may be quite difficult or unfeasible. In these cases, we considered blinding to have occurred if at least one specific group was explicitly reported as blinded. In trials where blinding feasibility was not a problem, at least two groups must have been explicitly reported as blinded to qualify as a blinded study.
The meaning of intention-to-treat analysis has been interpreted in different ways by different investigators (32, 33, 34). For the purpose of this study, we adopted the most common and also the strictest definition. Intention to treat was defined as the inclusion of all patients randomly assigned in the analysis, regardless of whether they actually satisfied the entry criteria, the treatment actually received, and subsequent withdrawal or protocol deviations.
Each of the three key methodological domains was scored 1 point if the method was appropriate and 0 points if it was inappropriate or not clearly reported. A combined key methodological index score was calculated for each trial by adding the scores of the three domains (possible range, 0–3). An inter-rater agreement was calculated for each key methodology.
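As a rough sketch of the rules above, the blinding criterion and the 3-point key score could be expressed as follows. The function and parameter names are hypothetical; the thresholds (two explicitly blinded groups when blinding is feasible, one when it is not) and the 0 to 3 range come from the text.

```python
def is_blinded(groups_blinded: int, blinding_feasible: bool) -> bool:
    """Apply the study definition of a blinded trial.

    groups_blinded: number of groups explicitly reported as blinded.
    blinding_feasible: False for interventions where blinding some groups
    is difficult or unfeasible (e.g. surgery), True otherwise.
    """
    required_groups = 2 if blinding_feasible else 1
    return groups_blinded >= required_groups


def key_score(concealment_ok: bool, blinded: bool, itt_ok: bool) -> int:
    """Sum the three key methodological domains (possible range 0-3)."""
    return int(concealment_ok) + int(blinded) + int(itt_ok)


# Hypothetical drug trial: concealment not reported, patients and outcome
# assessors explicitly blinded, strict intention-to-treat analysis reported.
blinded = is_blinded(groups_blinded=2, blinding_feasible=True)
print(key_score(concealment_ok=False, blinded=blinded, itt_ok=True))
```

A domain that is inappropriate and a domain that is simply not reported both score 0, so the key score cannot distinguish poor methods from poor reporting.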
Definition of predictor variables
Sample size was defined as the number of patients randomized in each trial. The journal of publication refers to the journal where each trial was published. Disease corresponds to the condition under treatment in each trial. Complete funding by industry was defined as industry being the only funding source reported by a trial.
Hypotheses
Impact factor, sample size, and industry funding have been associated with better reporting quality in previous studies (14, 20). Therefore, we hypothesized that studies published in JCEM, those completely funded by industry, and those with a larger sample size would be associated with better reporting quality. We expected JCEM to have better reporting quality because this journal has the highest impact factor among the three journals included in this study. In addition, as research on diabetes is frequently funded by industry, we also examined the association between diabetes and reporting quality.
Statistical analysis
The percentage of trials that scored yes on each overall quality item and on each key methodological item and the associated 95% confidence interval were calculated. Chance-adjusted inter-rater agreements were calculated using Cohen's κ statistic. Agreement was judged as poor (κ ≤ 0.2), fair (0.21–0.4), moderate (0.41–0.6), substantial (0.61–0.8), or good (κ > 0.8) (23).
Categorical data are reported as number and percentages. Continuous data are expressed as mean and SD or median and interquartile range (IQR).
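For readers unfamiliar with chance-adjusted agreement, the sketch below computes Cohen's kappa for two raters from the standard textbook formula and maps a value onto the agreement bands listed above. It is an illustration, not the authors' SAS or STATA code; the example ratings are hypothetical, whereas the kappa values passed to `agreement_label` are the ones reported in this article.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same, non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance, from each rater's marginal frequencies.
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


def agreement_label(kappa: float) -> str:
    """Map kappa onto the bands used in this study."""
    if kappa <= 0.2:
        return "poor"
    if kappa <= 0.4:
        return "fair"
    if kappa <= 0.6:
        return "moderate"
    if kappa <= 0.8:
        return "substantial"
    return "good"


# Hypothetical ratings (1 = item reported, 0 = not reported) for ten articles.
ratings_a = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
ratings_b = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
print(round(cohens_kappa(ratings_a, ratings_b), 2))

# Kappa values reported in the Results section of this article.
for reported_kappa in (0.92, 0.55, 0.62, 0.65):
    print(reported_kappa, agreement_label(reported_kappa))
```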
To identify variables associated with overall reporting quality, we conducted a multivariable regression analysis with OQS as the outcome variable. Because our sample comes from three journals, which likely differ in methodological reporting quality, we assumed correlation in manuscript quality within each journal. Thus, the regression analysis was performed using generalized estimating equations to account for this plausible clustering effect (29). Within-journal correlation was modeled using an exchangeable working correlation matrix. Statistical inference on each independent variable was based on the sandwich estimate of variance, which is robust to misspecification of within-cluster correlation. Rating scores are nonnegative counts; thus, we assumed the Poisson distribution for outcomes in the generalized estimating equations. Variables were considered statistically significant at α = 0.05 in the multivariable analysis. For studying predictors of the reporting of key methodological factors, the same procedure was performed using the key score as the outcome variable. Statistical analyses were conducted using SAS 9.0 (Cary, NC) and STATA 9.0 (College Station, TX).
| Results |
|---|
The RCT selection process is outlined in Fig. 1. The inter-rater agreement (κ) for article selection was 0.92 [95% confidence interval (CI), 0.86–0.97]. A total of 89 articles were included for abstraction. Table 1 shows the main characteristics of the trials. Almost 80% of the RCTs were published by one journal. The most common diseases were metabolic bone disease, fertility problems, and thyroid disease. The most frequent type of intervention was drug therapy. The funding source was reported in 78% of the articles. Funding came from two or more different sources in 32.6% of the RCTs. Most RCTs were small, with the median sample size being 69 individuals (IQR = 75). There were two relatively large trials with 1471 and 5091 participants, respectively.
The ratings of overall quality factors are shown in Table 2. Some of the factors that make up the OQS were consistently well reported (items 1, 2, 3, 10, and 13), but others were poorly reported in the majority of the articles. Five of the 15 items of the OQS were properly reported in fewer than 40% of the studies. These items are outcome definitions, sample size calculation, method used to generate the randomization sequence, implementation of the randomization process, and statement of the length of the recruitment and follow-up periods. The mean OQS was 10.0 (SD = 2.03), and the median was 10 (IQR = 2), with the minimum and maximum score being 5 and 15, respectively.
Rating of key methodological factors
For the rating of the key methodologies, the inter-rater agreement (κ) was 0.55 (95% CI, 0.25–0.84) for allocation concealment, 0.62 (95% CI, 0.40–0.83) for intention-to-treat analysis, and 0.65 (95% CI, 0.46–0.84) for blinding. The percentage of articles that reported each key methodology is provided in Table 3. Among the 89 studies, 51 (57%) did not report any of the three key methodological items, and only one article reported all three items. The reporting of key methodologies among studies with OQS above the 75th percentile was also poor. Only 28.6, 19, and 28.6% of the articles in the top quartile of the OQS described an appropriate allocation concealment method, intention-to-treat analysis, and blinding, respectively. Consequently, a good OQS (i.e. ≥ 12) does not guarantee good reporting of key methodologies.
Only 20 of the 89 RCTs (22%) were blinded according to the study definition. Among the 84 studies where blinding of patients was considered feasible, only 20 studies (23%) reported specifically at least two blinded groups. Table 4 shows the blinding rates by groups involved in these 84 RCTs. Among the five studies where blinding of patients was considered not feasible, none reported blinding of another group.
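The Methods state that a 95% confidence interval was calculated for each reported percentage but do not name the interval method. As a hedged illustration only, the sketch below applies the Wilson score interval (an assumption, as is the rounded z value of 1.96) to the 20 of 89 trials classified as blinded.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# 20 of the 89 included RCTs met the study definition of blinding.
low, high = wilson_interval(20, 89)
print(f"{20 / 89:.1%} blinded, 95% CI {low:.1%} to {high:.1%}")
```

The intervals in the article's tables may differ slightly if the authors used another method, such as a normal approximation or an exact binomial interval.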
Table 5 displays the results of the multivariable analysis of factors associated with the OQS and each category's median OQS. The regression model shows that complete industry funding and publication in JCEM were significantly associated with a 1.4 and 6.8% increase in the OQS, respectively. In addition, a 1-unit increment in the log scale in sample size was significantly associated with a 4.8% increase in the OQS. The multivariable analysis of factors associated with better reporting of key methodologies did not show a significant association for any of these variables. The variable diabetes was not included in the regression analyses because there were only six trials on this topic. The overall reporting quality was lower for diabetes trials (median OQS = 8) than for other trials (median OQS = 10). The median key scores did not differ between diabetes and other diseases.
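The percentage increases quoted above follow directly from the incidence rate ratios (IRRs) given in the abstract: an IRR multiplies the expected score, so the relative change is (IRR - 1) x 100. The short sketch below reproduces that arithmetic; the function name is hypothetical, and the IRR values are those reported in this article.

```python
def irr_to_percent_change(irr: float) -> float:
    """Convert an incidence rate ratio into a percent change in the expected score."""
    return (irr - 1.0) * 100.0


# IRRs reported in the abstract of this article.
reported_irrs = {
    "complete industry funding": 1.014,
    "publication in JCEM": 1.068,
    "sample size (per 1-unit increase on the log scale)": 1.048,
}

for predictor, irr in reported_irrs.items():
    change = irr_to_percent_change(irr)
    print(f"{predictor}: {change:.1f}% higher expected OQS")
```

At the median OQS of 10, a 1.4% increase corresponds to about 0.14 points on the 15-point scale, which is why the abstract describes the associations as only slightly better reporting.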
| Discussion |
|---|
The CONSORT initiative was intended to improve the reporting quality of RCTs. The 2001 CONSORT Statement provides guidelines for reporting results of parallel design RCTs (13). Variants of the CONSORT were subsequently written to provide guidelines for reporting results from other RCT designs. The CONSORT website (www.consort-statement.org) provides updated information as new recommendations emerge. The effectiveness of CONSORT in improving the reporting quality of RCTs has been widely evaluated. A recent systematic review analyzed eight studies comparing the RCT reporting quality in CONSORT adopter journals before and after the CONSORT publication and the reporting quality between CONSORT adopters and nonadopters. Overall, this review showed CONSORT adoption is associated with some improvement in the reporting quality of RCTs (22). However, the magnitude of improvement varied considerably among included studies. A possible explanation for this variability is the lack of consistency in enforcing the use of the CONSORT checklist among CONSORT adopter journals. Mills et al. (25), in an analysis of general medical and specialist journals that endorse CONSORT, recently found that reporting was not enforced consistently, with specialty journals lagging behind general medical journals. Therefore, we recommend that endocrinology journals adopt CONSORT and enforce its use by requiring authors to submit a completed CONSORT checklist with their manuscripts.
We identified a statistically significant association between overall reporting quality and sample size, complete industry funding, and publication in JCEM. However, only some proportion of the variation in the OQS between studies can be explained by these variables. Other variables such as awareness of the CONSORT statement by authors, adoption of CONSORT by journals, and availability of advice from a methodological expert when planning an RCT could be stronger predictors for reporting quality. Testing those hypotheses, however, was out of the scope of our study.
There are several limitations to our study. We did not measure RCT methodological quality directly, because we did not verify the information from the authors or their protocols. Important methodological detail may be omitted from published reports. Therefore, the quality of reporting should be taken only as a surrogate of true methodological quality. For example, Devereaux et al. (37) found that authors of RCTs frequently used allocation concealment and blinding, despite the failure to report these methods. Similarly, Pildal et al. (38) found that approximately 40% of RCT reports with unclear allocation concealment had adequate concealment according to their protocols. Nevertheless, because the report is usually the only source for clinicians and other researchers to judge the validity and generalizability of the results, the quality of report has an important value by itself. Another limitation is that our reporting quality scores are not validated, nor is there any validated tool to assess the reporting quality of RCTs. Furthermore, there is no agreement on what is the best way to evaluate methodological quality of RCTs. There are more than 25 different quality assessment scales, but most of them have not been rigorously developed or tested for validity and reliability (39). In addition, the type of scale used to assess trial quality has been shown to substantially change the interpretation of metaanalyses of RCTs (40). The exclusion of crossover design trials could also be considered a limitation. We excluded them because our OQS tool is based on the revised 2001 CONSORT statement, which was intended only for trials that used a parallel group design. Finally, the inclusion of only general endocrinology journals may affect the generalizability of our results. Our findings may not represent the quality of reporting of RCTs published in other endocrinology journals. 
Evaluation of the reporting of trials in areas such as osteoporosis and diabetes still requires investigation. Despite these limitations, we think our results have good internal validity. We developed a standardized evaluation instrument, and the selection and abstraction processes were independently performed by two qualified reviewers. Disagreements were not uncommon, and they often arose from a lack of transparency or from contradictory information in the reports. Nevertheless, raters achieved a substantial degree of concordance beyond chance for most criteria, lending internal validity to our results.
In conclusion, our study findings show the reporting quality of RCTs in general endocrinology is suboptimal. The knowledge gained from this study should be taken as an opportunity for improvement and to increase awareness and open debate on this issue. We recommend that endocrinology journals endorse CONSORT. This would help researchers in endocrinology to improve the planning and reporting of their future RCTs and have a direct benefit on the clinical application of their work results. A future evaluation of the reporting quality after CONSORT endorsement would be useful in assessing the effectiveness of this measure.
| Footnotes |
|---|
Disclosure Statement: The authors have nothing to disclose.
First Published Online June 26, 2008
Abbreviations: CI, confidence interval; CONSORT, Consolidated Standards of Reporting Trials; IQR, interquartile range; OQS, overall quality score; RCT, randomized controlled trial.
Received April 15, 2008.
Accepted June 12, 2008.