John H. Stroger Jr. Hospital of Cook County and Rush Medical College (R.K., A.T.E., C.V.V.), Chicago, Illinois 60612; Walsall Manor Hospital National Health Service Trust (T.A.M.A.), Walsall, WS2 9PS United Kingdom; University of Milan and Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Policlinico San Donato (B.A.), San Donato Milanese 20097, Italy; Regional Centre for Endocrinology and Diabetes (A.B.A., C.H.C.), Royal Victoria Hospital, Belfast BT12 6BA, United Kingdom; Queen Elizabeth Hospital (C.H.C.), Hong Kong Special Administrative Region; Keele University and North Staffordshire Hospital (R.N.C.), Stoke-on-Trent ST4 7QB, United Kingdom; Hacettepe University (E.N.G.), 06100 Ankara, Turkey; IRCCS G. Gaslini (M.M.), University of Genova, 5-16147 Genova, Italy; Cincinnati Children's Hospital Medical Center and University of Cincinnati (S.R.R.), Cincinnati, Ohio 45229; Christchurch Hospital (S.G.S.), Christchurch, New Zealand; and Institute of Endocrinology, Metabolism, and Hypertension (K.T.), Tel Aviv Sourasky Medical Center, Tel Aviv 64239, Israel
Address all correspondence and requests for reprints to: Rasa Kazlauskaite, M.D., M.Sc., 1900 West Polk Street, Room 806, Chicago, Illinois 60612. E-mail: rasa_kazlauskaite{at}rush.edu.
| Abstract |
|---|
Objective: Our objective was to compare standard-dose and low-dose corticotropin tests for diagnosing hypothalamic-pituitary adrenal insufficiency (HPAI).
Data Sources: We searched the PubMed database from 1966–2006 for studies reporting diagnostic value of standard-dose or low-dose corticotropin tests, with patient-level data obtained from original investigators.
Study Selection: Eligible studies had more than 10 patients. All subjects were evaluated because of suspicion for chronic HPAI, and patient-level data were available. We excluded studies if there was no accepted reference standard for HPAI (insulin hypoglycemia or metyrapone test), if test subjects were in the intensive care unit, or if only normal healthy subjects were used as controls.
Data Extraction: We constructed receiver operator characteristic (ROC) curves using patient-level data from each study and then merged results to create summary ROC curves, adjusting for study size and cortisol assay method. Diagnostic value of tests was measured by calculating area under the ROC curve (AUC) and likelihood ratios.
Data Synthesis: Patient-level data from 13 of 23 studies (57%; 679 subjects) were included in the metaanalysis. The AUC were as follows: low-dose corticotropin test, 0.92 (95% confidence interval 0.89–0.94), and standard-dose corticotropin test, 0.79 (95% confidence interval 0.74–0.84). Among patients with paired data (seven studies, 254 subjects), diagnostic value of low-dose corticotropin test was superior to standard-dose test (AUC 0.94 and 0.85, respectively; P < 0.001).
Conclusions: Low-dose corticotropin test was superior to standard-dose test for diagnosing chronic HPAI, although it has technical limitations.
| Introduction |
|---|
The reference tests for establishing the integrity of the HPA axis require assessing the response to either a strong stimulus (e.g. insulin-induced hypoglycemia) or an interruption of the negative feedback from cortisol (overnight metyrapone test). However, these reference tests have major drawbacks. The insulin tolerance test is contraindicated in the elderly and those with a history of seizures or cardiovascular disease and requires continuous physician supervision to monitor for serious adrenergic or neurological symptoms (1). The overnight metyrapone test carries a risk of adrenal crisis, and errors can occur from other drugs affecting metyrapone clearance (2). Thus, there is a great clinical demand for alternative tests that are quicker, cheaper, and safer.
The rationale for using the corticotropin analog stimulation test is the assumption that in chronic endogenous corticotropin deficiency, acute responsiveness of the adrenal zona fasciculata is diminished and fails to mount an adequate cortisol response (3). We examined the published literature for evidence on two corticotropin analog stimulation tests using either a standard (250 µg) or low (1 µg) dose of the corticotropin analog. The primary objective of our quantitative metaanalysis was to compare the performance of the standard-dose corticotropin stimulation test (SDCT) and low-dose corticotropin stimulation test (LDCT) in diagnosing HPAI (defined by results of the insulin tolerance test or overnight metyrapone test). The second objective was to determine how best to incorporate the tests in clinical practice, especially in relation to the basal cortisol level.
| Materials and Methods |
|---|
We searched the PubMed (www.PubMed.gov) database from 1966–2006 for articles with keywords adrenal insufficiency and diagnosis and limited the search to human studies published in the English language. We selected studies with at least 10 subjects with suspected HPAI and required that the disease be verified with either the insulin tolerance test or metyrapone test. We then tried to contact the principal investigator of each relevant study to request their patient-level data on the following variables: results of the integrated HPA axis reference test, baseline cortisol value, and cortisol values after the SDCT and LDCT.
Population characteristics
To be eligible for inclusion, adult and pediatric subjects had to be suspected of HPAI from disease or injury to the pituitary or hypothalamus or from prolonged exogenous glucocorticoid administration in supraphysiological doses. Patients had to be affected by hypothalamic-pituitary disease for at least 4 wk to exclude acute hypothalamic or pituitary disorders. A normal sleep-wake cycle was required (or assumed, if no information). We did not include data from studies done in critical care settings.
In actual clinical practice, testing for HPAI is performed when there is some suspicion for HPAI. We have therefore investigated the performance of the tests in this at-risk population to avoid the problem of spectrum bias (4), which occurs when tests are evaluated among patients who are different from the ones who will be tested in practice. Thus, we excluded from analysis those subjects who were described as normal healthy control subjects (those recruited as healthy volunteers and not due to suspected pituitary disease based on signs/symptoms or imaging).
The most common reason for testing (43%) was because of a pituitary macroadenoma (before or after surgical or radiation treatment). The second most common reason was treatment of other intracranial tumors (23%); 14% of patients were tested due to pituitary disease other than pituitary macroadenoma; 13% were tested due to other pituitary hormone deficiencies (GH or panhypopituitarism). Prolonged supraphysiological glucocorticoid treatment prompted HPAI evaluation in 7% of total patients, but in only eight patients were there paired data on LDCT and SDCT (5, 6, 7). The only study (5) that tested patients solely due to suspected glucocorticoid-induced HPAI was analyzed separately [see Fig. 1 in Kane et al. (6)], but this study did not have paired SDCT and LDCT data.
Reference test
The diagnosis of HPAI was based on an abnormal response to one of the two reference standards for evaluating the integrity of the HPA axis: insulin tolerance test or overnight metyrapone test. We relied on individual study investigators to correctly dichotomize the reference test results into HPAI present or absent.
Cortisol assay
Cortisol assays are not standardized and vary across hospitals and studies (8, 9, 10). The cortisol assay methods used in various studies included RIA or commercially available immunometric methods. To adjust for cortisol assay variability, first we excluded five studies that used the fluorescence polarization assay, which is less specific than more modern methods (8) and has never been used to evaluate LDCT. Second, because plasma cortisol is known to be consistently higher than serum cortisol levels, we converted plasma values (used in two studies) (6, 11) to their expected serum value by multiplying by the published correction factor of 0.877 (12).
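The plasma-to-serum adjustment described above is a single multiplication. A minimal sketch follows, assuming values are kept in the same units before and after conversion; the function name and the example input are hypothetical and are not taken from the study data.

```python
# Correction factor quoted in the text for converting plasma cortisol
# to its expected serum value.
PLASMA_TO_SERUM_FACTOR = 0.877


def plasma_to_serum(plasma_cortisol: float) -> float:
    """Return the expected serum cortisol for a plasma cortisol value.

    Units are unchanged (e.g. µg/dl in, µg/dl out).
    """
    return plasma_cortisol * PLASMA_TO_SERUM_FACTOR


# Hypothetical plasma cortisol value in µg/dl, for illustration only.
expected_serum = plasma_to_serum(20.0)
print(round(expected_serum, 2))
```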
We also investigated whether the results are sensitive to assay type by reassessing the SDCT and LDCT comparisons using the same thresholds for all studies (weighted mean cortisol thresholds) rather than individual study cortisol thresholds (see statistical analysis below).
SDCT
One of the two available synthetic corticotropin analogs, cosyntropin (Cortrosyn, Amphastar Pharmaceuticals, Inc.) or tetracosactrin (Synacthen, Novartis Pharma, Switzerland), was administered iv at a dose of 250 µg, and serum cortisol levels were obtained at baseline and at least once after injection (most commonly at 30 or 60 min).
A dose of 250 µg (0.25 mg) of cosyntropin or tetracosactrin is equivalent to 25 USP units of corticotropin, indicating the equivalence of Synacthen and Cortrosyn formulations. For brevity, we use the term corticotropin for both analogs, while acknowledging that Synacthen and Cortrosyn are synthetic corticotropin analogs, different from the native ACTH molecule.
LDCT
The low-dose test was performed in the morning with patients fasting. One of the two synthetic corticotropin analogs (cosyntropin or tetracosactrin) was administered intravenously in a 1-µg dose, after being prepared using the method of Dickstein and colleagues (13). Serum cortisol was measured at baseline and at 30 min after injection (except for one study, which measured it at 20 min). We excluded studies that used a different dose or different protocol.
Basal cortisol
All studies measured serum cortisol between 0800 and 1000 h after an overnight fast (basal cortisol). The database from one study (11) did not provide basal cortisol.
Statistical analysis
We conducted data analysis using Stata statistical software, version 10.0 (StataCorp LP, College Station, TX).
To compare performance of LDCT and SDCT (and basal cortisol), we used receiver-operator-characteristic (ROC) curve analysis. From each study's data, we calculated the area under the ROC curve (AUC) with 95% confidence intervals (CI) (Fig. 1). The same methods were used to compare test performance at different time points for measuring stimulated cortisol after corticotropin injection (usually, 30 or 60 min).
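The analyses reported here were run in Stata. The sketch below is not that procedure; it only illustrates what an AUC measures for a test in which lower cortisol values point toward HPAI. It uses the rank-based (Mann-Whitney) definition: the proportion of patient pairs in which the HPAI patient has the lower cortisol value, counting ties as one half. Confidence intervals and the adjustments for study size and assay are omitted, and all cortisol values are invented for illustration.

```python
def auc_lower_is_positive(hpai_values, non_hpai_values):
    """Rank-based AUC for a test where LOWER values indicate disease.

    Counts the fraction of (HPAI, non-HPAI) pairs in which the HPAI
    patient has the lower cortisol value; ties count as 0.5.
    """
    if not hpai_values or not non_hpai_values:
        raise ValueError("both groups must contain at least one value")
    score = 0.0
    for x in hpai_values:
        for y in non_hpai_values:
            if x < y:
                score += 1.0
            elif x == y:
                score += 0.5
    return score / (len(hpai_values) * len(non_hpai_values))


# Hypothetical stimulated cortisol values (µg/dl), not study data.
hpai = [8, 12, 15, 19]
non_hpai = [14, 20, 24, 27]
print(auc_lower_is_positive(hpai, non_hpai))  # 14 of 16 pairs -> 0.875
```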
To combine data across studies, we categorized cortisol responses into three intervals (high, indeterminate, and low likelihood of HPAI), using standard criteria applied consistently to all studies. Two thresholds were defined for each test: the threshold below which cortisol values had high likelihood of HPAI [likelihood ratio (LR) > 9; rule-in threshold] and the threshold above which cortisol values had a low likelihood of HPAI (LR < 0.15; rule-out threshold). Cortisol values between these thresholds (LR = 0.15–9) defined the interval with indeterminate likelihood of HPAI.
LR were calculated as the ratio of two probabilities: the probability of the test result among patients with HPAI divided by the probability of the same test result among patients without HPAI.
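The likelihood ratio definition and the two cutoffs above can be sketched as follows. The function names and patient counts are hypothetical; only the LR formula and the cutoffs (LR > 9 for rule-in, LR < 0.15 for rule-out) come from the text.

```python
def likelihood_ratio(n_hpai_in_interval, n_hpai_total,
                     n_non_hpai_in_interval, n_non_hpai_total):
    """LR = P(result | HPAI) / P(result | no HPAI) for one cortisol interval."""
    p_hpai = n_hpai_in_interval / n_hpai_total
    p_non_hpai = n_non_hpai_in_interval / n_non_hpai_total
    if p_non_hpai == 0:
        # Undefined when neither group has results in the interval;
        # otherwise the ratio is unbounded.
        return float("nan") if p_hpai == 0 else float("inf")
    return p_hpai / p_non_hpai


def classify_interval(lr):
    """Map a likelihood ratio onto the three categories used in the text."""
    if lr > 9:
        return "high likelihood of HPAI"
    if lr < 0.15:
        return "low likelihood of HPAI"
    return "indeterminate"


# Hypothetical counts: 30 of 50 HPAI patients and 2 of 100 non-HPAI
# patients fall in a given cortisol interval.
lr = likelihood_ratio(30, 50, 2, 100)
print(classify_interval(lr))
```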
The use of categorized cortisol responses recalibrates individual study results to a common metric, therefore permitting paired comparisons of the ROC areas for SDCT, LDCT, and basal cortisol, stratified by study and cortisol assay.
Comparison of corticotropin stimulation tests using summary cortisol thresholds
We combined results across studies by calculating a weighted average of the cortisol values defining the rule-in threshold and the rule-out threshold for each test (Table 1). The weights were based on study sample size. Using these new summary thresholds, regardless of cortisol assay method, we recategorized the data from all studies and combined results in summary ROC curves. This analysis was performed to determine whether the results of SDCT and LDCT performance comparisons were sensitive to cortisol assay type (Fig. 2B).
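A sample-size-weighted average of study thresholds, as described above, can be sketched in a few lines. The study thresholds and sample sizes below are invented for illustration and do not reproduce Table 1; the function name is hypothetical.

```python
def weighted_mean_threshold(study_thresholds):
    """Average per-study cortisol thresholds, weighted by sample size.

    study_thresholds: list of (threshold, n_subjects) tuples.
    """
    total_n = sum(n for _, n in study_thresholds)
    if total_n == 0:
        raise ValueError("total sample size must be positive")
    return sum(threshold * n for threshold, n in study_thresholds) / total_n


# Hypothetical rule-in thresholds (µg/dl) from two studies of 40 and 60 subjects.
summary = weighted_mean_threshold([(15.0, 40), (18.0, 60)])
print(round(summary, 1))
```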
To address the research goal of defining an optimal testing strategy, we tested the sequential testing strategy of first measuring basal cortisol and then measuring a stimulated cortisol in only those subjects with an intermediate basal value. We used paired ROC curve analyses, adjusted for study size and cortisol assay (as described above), to compare LDCT and SDCT in subjects with indeterminate basal cortisol results (LR = 0.15–9 for HPAI).
Optimal testing strategy
Based on results of metaanalyses, we derived an optimal testing strategy algorithm (Fig. 3). The basal cortisol and LDCT summary thresholds described in Fig. 3 were based on the mean cortisol values weighted by study sample size (last row of Table 1).
| Results |
|---|
Cortisol testing methods varied from individual RIA kits (6, 7, 11, 14, 15, 16, 17) to immunometric test kits from various manufacturers (5, 18, 19, 20, 21, 22). The lack of a standard cortisol assay method (8, 9, 10) explains some of the variability in diagnostic cortisol thresholds reported across studies.
SDCT
After standard-dose corticotropin stimulation, there was variability across studies in the optimal timing for measuring cortisol response; however, in no study was there a statistically significant difference in diagnostic discrimination at 30 or 60 min or at peak response. In six studies, there was a trend favoring 60-min measurements; in six studies, the peak cortisol appeared best; and in four studies, 30-min testing appeared best. In our analyses of standard-dose testing, we used 30-min serum cortisol values and used peak cortisol in the studies where 30-min values were not available (7, 15, 17).
Combining results from 10 studies of standard-dose stimulation (346 subjects), 30-min cortisol values of less than 16 µg/dl (440 nmol/liter) were highly predictive of HPAI. Values greater than 30 µg/dl (833 nmol/liter) best predicted a normal reference test result (ruling out HPAI). Intermediate values, 16–30 µg/dl, were diagnostically indeterminate. The AUC for these categorized test results was 0.82 (95% CI = 0.78–0.86).
In paired analyses of the standard-dose stimulation test and the basal cortisol test (nine studies, 302 subjects), diagnostic discrimination was similar; AUC was 0.79 for the SDCT and 0.80 for the basal cortisol test (P = 0.45).
LDCT
After low-dose corticotropin stimulation, 30-min cortisol measurements had superior test characteristics in three studies (18, 19, 21), whereas two studies found 20-min values to be best (although not statistically different from 30-min values) (20, 22). In our analyses of LDCT, we used 30-min cortisol values (5, 11, 14, 16, 18, 19, 20, 21, 22) or, if not available, the 20-min (17) or peak values (7).
A metaanalysis of the 11 studies (589 subjects) using the LDCT found that values less than 16 µg/dl (440 nmol/liter) best predicted HPAI, whereas values greater than 22 µg/dl (600 nmol/liter) best predicted a normal reference test result. The AUC for these diagnostic thresholds was 0.94 (95% CI = 0.90–0.94).
In paired analyses (10 studies, 545 subjects), the LDCT was superior to the basal cortisol test in overall diagnostic discrimination; AUC was 0.80 for basal cortisol and 0.92 for LDCT (P = 0.01).
Comparison of LDCT and SDCT
In the seven studies with paired 30-min cortisol data for both tests (254 subjects), the LDCT had a larger AUC compared with the SDCT, 0.94 vs. 0.85 (P < 0.001), after adjusting for type of cortisol assay (Fig. 2C). Excluding the eight patients with glucocorticoid-induced HPAI (7) did not change the results.
Among subjects with indeterminate basal cortisol values (5–13 µg/dl, or 138–365 nmol/liter) (paired data from six studies), the LDCT was more discriminating than the SDCT in diagnosing HPAI; AUC was 0.94 for low-dose and 0.75 for standard-dose test (P < 0.001).
Basal cortisol test
In a metaanalysis of 12 studies (635 subjects), a basal cortisol less than 5 µg/dl (138 nmol/liter) best predicted HPAI, whereas values greater than 13 µg/dl (365 nmol/liter) best predicted normal HPA axis testing. The AUC was 0.79 (95% CI = 0.75–0.82).
| Discussion |
|---|
Our results differ from the metaanalysis of Dorin and colleagues (23), who reported similar operating characteristics for the LDCT and SDCT. There are three possible reasons for the discrepant results. First, the metaanalyses differed in the studies that were included. All studies included in our metaanalysis, except three (5, 16, 17), were also included in the metaanalysis by Dorin and colleagues. The three exceptions were published after Dorin's literature search. Dorin also included 12 studies that we decided not to include for the following reasons: five studies (published before 1990) tested cortisol levels using the fluorescence polarization assay, which is less cortisol specific than more modern assay methods (8) and has never been used to evaluate LDCT; two studies evaluated subjects in the early postoperative period; and five studies (three with paired comparisons of LDCT and SDCT) were considered eligible for inclusion in our metaanalysis, but we were unable to obtain patient-level data. Among the three eligible studies with paired comparisons that were not included, one did not recruit consecutive patients (24); the second (25) reported superiority of SDCT, but no data were provided regarding statistical significance (their study also included subjects with glucocorticoid-induced HPAI); and the third (26) demonstrated results favoring LDCT, although the results were not statistically significant.
A second reason is that the metaanalyses differed in the study subjects that were included. Whereas the Dorin metaanalysis used summary data reported in the published articles, we obtained patient-level data from study investigators and could therefore apply patient-level eligibility criteria. For example, subjects were not eligible in our metaanalysis if they were healthy control subjects without any suspicion of HPAI (to reduce the risk of spectrum bias) (4) or if they had pituitary surgery within 4 wk of testing.
A third reason for the discrepancy is the difference in analytic methods. Because we had access to patient-level data, we were able to adjust for cortisol assay and sample size and also able to reanalyze data to define two (rule-in and rule-out) cortisol thresholds per test, instead of relying on the single cortisol threshold available in published reports.
In the unadjusted analysis (Fig. 2A), the low-dose and standard-dose tests perform similarly; the AUC is slightly better for the LDCT, but the difference is probably clinically unimportant. However, when the analysis is adjusted for study size and cortisol assay, the superiority of the LDCT is more dramatic (Fig. 2C) and likely to be clinically relevant. Using summary cortisol thresholds, regardless of cortisol assay method, we have found that the results of SDCT and LDCT performance comparisons were not sensitive to cortisol assay type (Fig. 2B). Nevertheless, there may be clinical settings where SDCT is more appropriate to diagnose HPAI, especially if the quality of the low-dose testing protocol cannot be assured.
Based on our findings, we suggest a three-step approach for evaluating patients with possible HPAI (Fig. 3). The first step is measuring a morning basal cortisol. If the results are not convincingly normal or abnormal (basal cortisol level falls in indeterminate range of 5–13 µg/dl), then the second step is performing a LDCT. If this test is indeterminate and there are no contraindications to integrated HPA axis testing, we suggest the third step of an insulin hypoglycemia test or metyrapone test. This three-step approach will accurately diagnose the majority of patients, but because it is not perfect, there will still be an important role for clinical judgment, especially regarding use of glucocorticoid supplementation during extreme stress. For convenience, in appropriate clinical circumstances, the first and second steps (basal and LDCT or SDCT) can be done at the same clinical visit to reduce the number of visits for testing.
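The decision logic of the three-step approach can be written out as a short function. This is an illustrative sketch only, not a clinical tool: it uses the summary thresholds reported in the Results (basal cortisol below 5 or above 13 µg/dl; 30-min LDCT cortisol below 16 or above 22 µg/dl), which the text notes are assay dependent, and it does not model contraindications or clinical judgment. The function name and return strings are hypothetical.

```python
# Summary thresholds in µg/dl, taken from the Results section.
BASAL_RULE_IN, BASAL_RULE_OUT = 5.0, 13.0
LDCT_RULE_IN, LDCT_RULE_OUT = 16.0, 22.0


def three_step_result(basal_cortisol, ldct_cortisol=None):
    """Return the outcome or next step for the three-step strategy.

    basal_cortisol: morning basal serum cortisol (µg/dl).
    ldct_cortisol: 30-min cortisol after low-dose stimulation (µg/dl),
                   or None if the LDCT has not been performed.
    """
    # Step 1: basal cortisol.
    if basal_cortisol < BASAL_RULE_IN:
        return "HPAI likely"
    if basal_cortisol > BASAL_RULE_OUT:
        return "HPAI unlikely"
    # Step 2: LDCT for indeterminate basal values.
    if ldct_cortisol is None:
        return "perform LDCT"
    if ldct_cortisol < LDCT_RULE_IN:
        return "HPAI likely"
    if ldct_cortisol > LDCT_RULE_OUT:
        return "HPAI unlikely"
    # Step 3: reference test when the LDCT is also indeterminate.
    return "consider insulin hypoglycemia or metyrapone test"


print(three_step_result(9.0))        # indeterminate basal value
print(three_step_result(9.0, 18.0))  # indeterminate LDCT value
```

Values that fall exactly on a threshold are treated as indeterminate here, which matches the stated indeterminate basal range of 5–13 µg/dl.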
Gleeson and colleagues (27) published a retrospective study of 10 yr of observational data on patients undergoing SDCT. The clinical diagnosis of HPAI was ascertained by an unblinded assessment of the clinical course depicted in medical records. Although the authors report a 97% negative predictive value for SDCT, the data do not allow calculations of sensitivity, specificity, or accuracy. The impressive negative predictive value might be partly explained by a low prevalence of newly diagnosed HPAI (19%), which is lower than in most studies included in our metaanalysis. The lower prevalence might be a result of reference test bias, because the clinical diagnosis of milder HPAI might be easily missed without the aid of integrated HPA axis testing. In addition, there is a significant risk of selection bias because 38% of patients were excluded from analysis.
Agha and colleagues (28) reported that only 3–8% of patients with an intermediate response to SDCT (30 min cortisol, 18.5–23 µg/dl or 510–635 nmol/liter) developed signs of adrenal insufficiency over an average follow-up of 4.2 yr. The implication is that a clearly negative standard-dose test result (>23 µg/dl, or > 635 nmol/liter) should perform even better than the intermediate results and effectively rule out HPAI. However, no data are provided regarding patients with either positive or negative standard-dose test results, and there is no assessment of the overall prevalence of adrenal insufficiency in the entire test population. In fact, for certain cortisol assay methods (6), a 30-min cortisol value in the range studied by Agha (18.5–23 µg/dl, or 510–635 nmol/liter) would be classified as a negative test, with a low likelihood of HPAI (Table 1). Unfortunately, the data from Agha's study do not allow the calculation of test characteristics for positive or negative SDCT results.
Due to the lack of cortisol assay standardization and other reasons for measurement variability, the error in measuring cortisol can be up to 6 µg/dl (165 nmol/liter); thus, caution is advised when making clinical decisions based on cortisol values close to threshold values. In addition to high variability in the cortisol thresholds, another concern is that a test with a low likelihood of HPAI does not exclude the possibility of future HPAI, especially if there is progression of hypothalamic-pituitary disease or radiation therapy. Therefore, longitudinal assessments may be necessary.
The LDCT has not been validated in patients with acute illnesses, abnormal sleep-wake cycles, or acute hypothalamic-pituitary disorders (e.g. within 1 month of pituitary surgery). In addition, all studies of the LDCT that were included in this metaanalysis were conducted in the morning with the patients fasting. Afternoon cortisol values tend to be lower by 1–1.5 µg/dl (28–58 nmol/liter) (13, 29), and the effect of eating or drinking is uncertain. We also have no information on how the LDCT would perform among patients with low serum protein levels, because circulating cortisol is highly protein bound.
There are several technical details to performing a low-dose test that must be rigorously addressed to avoid false-positive test results (falsely low 30-min stimulated cortisol value). Currently, there are two acceptable corticotropin analogs that can be used, cosyntropin (Cortrosyn, Amphastar Pharmaceuticals) or tetracosactrin (Synacthen, Novartis Pharma), supplied in vials containing 250 µg powder. Preparing the 1-µg dose requires a several-step process of first reconstituting with 250 ml normal saline and then using a 1-ml aliquot (1 µg) for iv injection. There are additional steps for minimizing adherence of the medication to plastic tubing (30). In addition, the timing of cortisol sampling after low-dose corticotropin administration is very important (we recommend collecting the blood sample 20–30 min after corticotropin analog administration), because later sampling may result in false-positive results (31). Thus, low-dose testing should be performed only by personnel knowledgeable of the multiple steps required for preparation and administration. If the quality of administering a 1-µg dose is suspect, then we recommend using the standard dose of 250 µg (reconstituted with 1 ml sterile diluent) and measuring serum cortisol 30 min after iv injection. A result less than 16 µg/dl (440 nmol/liter), which is the same threshold used for LDCT, strongly suggests HPAI. However, with standard-dose testing, the 30-min cortisol value must be greater than 30 µg/dl (833 nmol/liter) to be reasonably confident in ruling out HPAI.
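The dilution arithmetic behind the 1-µg dose can be checked with a short calculation. This sketch covers only the nominal concentration from the figures quoted above (250 µg reconstituted in 250 ml); it is not a preparation protocol and does not account for loss of drug through adherence to plastic tubing. The constant and function names are hypothetical.

```python
VIAL_CONTENT_UG = 250.0    # corticotropin analog per vial, from the text
DILUENT_VOLUME_ML = 250.0  # normal saline used for reconstitution, from the text

# Nominal concentration after reconstitution (µg per ml).
CONCENTRATION_UG_PER_ML = VIAL_CONTENT_UG / DILUENT_VOLUME_ML


def nominal_dose_ug(aliquot_ml: float) -> float:
    """Nominal dose (µg) contained in an aliquot of the diluted solution."""
    return aliquot_ml * CONCENTRATION_UG_PER_ML


print(nominal_dose_ug(1.0))  # a 1-ml aliquot nominally holds 1.0 µg
```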
Research has suggested (32, 33) that dehydroepiandrosterone sulfate (DHEA-S) blood levels might also help with assessing the HPA axis, particularly when the results of the LDCT are close to either of the two threshold values.
In establishing the diagnosis of HPAI, the accepted reference standard is an abnormal insulin tolerance test or metyrapone test. Both tests, however, can be unreliable. The average intra-subject variability in peak cortisol response to insulin-induced hypoglycemia is 8–12% (34), but in men with hypopituitarism, it can vary by 42% (35). Healthy control subjects have been known to fail this test. Neither of these reference tests has been validated by assessing predictive accuracy, that is, the ability to predict adrenal crisis.
A limitation of this metaanalysis is that we could not include data of four studies that had published paired results of LDCT and SDCT, because we were unable to obtain the patient-level data (24, 25, 26, 32). One study favored LDCT (26), and another study (25) apparently favored SDCT; however, the latter study had an incomplete ROC to estimate the magnitude of the difference. The third study (32) found no difference in performance of the two tests. The fourth study (24) used nonconsecutive patients for testing and is therefore not pertinent to our metaanalysis.
In conclusion, the performance of the LDCT was superior to the standard-dose test for evaluating HPA insufficiency. However, LDCT must be used by personnel knowledgeable of the multiple steps required for preparation and administration. We describe a three-step testing strategy that proposes use of basal cortisol and the low-dose stimulation test before proceeding to one of the reference tests for HPA insufficiency.
| Footnotes |
|---|
First Published Online August 12, 2008
Abbreviations: AUC, Area under the curve; CI, confidence interval; HPA, hypothalamic-pituitary-adrenal; HPAI, hypothalamic-pituitary adrenal insufficiency; LDCT, low-dose corticotropin stimulation test; LR, likelihood ratio; ROC, receiver-operator characteristic; SDCT, standard-dose corticotropin stimulation test.
Received March 31, 2008.
Accepted August 4, 2008.