Division of Endocrinology and Metabolism (C.R.M.), Department of Medicine; and Department of Public Health Sciences (G.J.S.), University of Virginia Health System, Charlottesville, Virginia 22908
Address all correspondence and requests for reprints to: Christopher R. McCartney, M.D., Division of Endocrinology and Metabolism, Department of Medicine, Box 800391, University of Virginia Health System, Charlottesville, Virginia 22908. E-mail: cm2hq{at}virginia.edu.
| Abstract |
|---|
Objective: The objective of the study was to explore the relative desirability of alternative diagnostic approaches to small thyroid nodules using decision analysis.
Design: Four diagnostic approaches to a 10- to 14-mm thyroid nodule are modeled: 1) observation only; 2) routine fine-needle aspiration biopsy (FNAB), an approach traditionally chosen by many endocrinologists and consistent with American Thyroid Association guidelines; 3) FNAB only when microcalcifications are present, as recommended by Society of Radiologists in Ultrasound guidelines; and 4) FNAB only when the nodule is hypoechoic and has at least one other ultrasonographic risk factor, as endorsed by American Association of Clinical Endocrinologists guidelines.
Main Outcome Measures: Measures included expected values; a priori likelihoods of prespecified outcomes; and two-way sensitivity analyses based on the utility of observation only in the setting of thyroid cancer and thyroid surgery for benign, asymptomatic thyroid disease.
Results: Expected values (EVs) were similar among decision alternatives modeling Society of Radiologists in Ultrasound guidelines, American Association of Clinical Endocrinologists guidelines, and routine observation (EVs from 0.912 to 0.927). Routine FNAB had the lowest EV (0.757–0.861), primarily related to a high a priori likelihood of having surgery for a benign nodule.
Conclusions: As a general approach to 10- to 14-mm thyroid nodules, routine FNAB appears to be the least desirable. This analysis offers additional data that physicians can use when choosing diagnostic approaches to small thyroid nodules based on perceived risks of delayed cancer diagnosis and unnecessary thyroid surgery.
| Introduction |
|---|
High-resolution thyroid ultrasonography (US) has enhanced evaluation of nodular thyroid disease, and several US characteristics (e.g. microcalcifications) are associated with an increased risk of malignancy (1, 2, 3). Nonetheless, none of these US findings is diagnostic, and fine-needle aspiration biopsy (FNAB) remains the cornerstone of thyroid cancer diagnosis. However, exactly which nodules should be targeted for FNAB remains controversial (1, 2, 3, 4, 5, 6, 7).
Routine FNAB of all thyroid nodules 10 mm or greater in diameter is an approach traditionally chosen by many thyroidologists. This approach is consistent with American Thyroid Association (ATA) guidelines (3) and with other expert recommendations (4, 5). The Society of Radiologists in Ultrasound (SRU) guidelines recommend FNAB of nodules 10 mm or greater only when microcalcifications are present; 15 mm or greater if completely or predominantly solid or if coarse calcifications are present; and 20 mm or greater if predominantly cystic with a solid component (1). American Association of Clinical Endocrinologists (AACE) guidelines state that FNAB should be performed on all hypoechoic nodules 10 mm or greater with one or more of the following US characteristics: irregular margins, chaotic intranodular vascular spots, a more-tall-than-wide shape, and microcalcifications (2). The National Comprehensive Cancer Network Thyroid Carcinoma Clinical Practice Guidelines offer similar recommendations (8).
Thus, the approach to small (10–14 mm) thyroid nodules could vary, depending on which guidelines are followed. This discordance in part reflects disagreement regarding the balance between two primary risks: the risk of delaying appropriate surgical treatment when thyroid cancer is present and the risk of undergoing hemithyroidectomy/isthmusectomy for a benign, asymptomatic thyroid nodule.
We applied decision analysis to the frequently encountered clinical scenario in which a solitary, solid (or predominantly solid), 10- to 14-mm thyroid nodule is discovered in an asymptomatic and euthyroid patient with no clinical evidence of metastases (e.g. no abnormal cervical lymphadenopathy by palpation or US). We estimated the relative desirability of four decision alternatives: 1) observation only; 2) routine FNAB; 3) FNAB only when microcalcifications are present on US; and 4) FNAB only when the nodule is hypoechoic and has one or more additional US risk factors.
| Materials and Methods |
|---|
Decision analysis model
We developed a decision analysis model (see Fig. 1 and Table 1) representing the four decision alternatives discussed above, primarily based on algorithms offered in AACE and ATA guidelines (2, 3). The model was developed for the baseline case of an asymptomatic patient with a solitary, solid (or predominantly solid), 10- to 14-mm thyroid nodule, with no concerning clinical features (e.g. no history of neck irradiation, no cervical lymphadenopathy), and a normal TSH. The model assumes that nodule characteristics will be adequately assessed by an experienced thyroid ultrasonographer and that FNAB will be performed under US guidance. The model incorporates events that follow four decision alternatives emanating from the model's decision node (represented by the open square).
The second alternative is routine FNAB regardless of US characteristics. With this approach, FNAB is persistently inadequate (i.e. yields too few epithelial cells for interpretation after one or more repeat FNAB attempts) in 5–10% of cases, and these patients proceed to surgery (hemithyroidectomy/isthmusectomy). The likelihood of cancer at surgery is equal to the prevalence of cancer among solitary nodules; the remainder of surgeries will disclose a benign nodule (this utility value is assigned the variable surgery_benign). FNAB is indeterminate (e.g. follicular neoplasm) in 10–15% of cases. In this case, a radioactive iodine (I-123) scan is performed. A hyperfunctioning nodule is assumed to be benign, but all other cases go to surgery (hemithyroidectomy/isthmusectomy), which reveals the nodule to be benign in 80% (utility = surgery_benign) and malignant in 20% (utility = 1.0). When FNAB is neither inadequate nor indeterminate, it is either positive (suggesting cancer) or negative (suggesting benign disease). A positive FNAB prompts surgery: this reveals the FNAB to be a true positive (utility = 1.0) or a false positive (utility = surgery_benign). In the case of a negative FNAB result, the nodule is observed (e.g. follow-up US in 1 yr). A negative FNAB represents either a true negative (utility = 1.0) or false negative (utility = observation_cancer).
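The routine FNAB subtree can be illustrated as a simple expected-value roll-up. The following Python sketch is not the TreeAge model used in this analysis; it is a minimal reconstruction under stated assumptions. It uses the 8% cancer prevalence, the baseline utilities of zero, and the conservative and optimistic FNAB parameters given in the text. It also assumes a utility of 1.0 for an observed hyperfunctioning nodule and a 5% likelihood of hyperfunction among indeterminate nodules; the latter is a hypothetical value, because Table 1 is not reproduced here. All function and parameter names are illustrative.

```python
def routine_fnab_ev(p_cancer, p_inad, p_indet, sens, spec, p_hyper,
                    p_cancer_indet=0.20,
                    u_surgery_benign=0.0, u_observation_cancer=0.0):
    """Expected value of the routine FNAB arm (simplified sketch).

    p_hyper is an ASSUMED parameter (not stated in the text).
    """
    p_adequate = 1.0 - p_inad - p_indet
    # Prevalence among adequate, determinate FNABs, chosen so that the
    # overall likelihood of cancer in the arm stays equal to p_cancer.
    p_cancer_adequate = (
        p_cancer
        - p_inad * p_cancer
        - p_indet * (1.0 - p_hyper) * p_cancer_indet
    ) / p_adequate

    # (probability, utility) for every terminal node of the subtree.
    outcomes = [
        (p_inad * p_cancer, 1.0),                        # inadequate, cancer at surgery
        (p_inad * (1.0 - p_cancer), u_surgery_benign),   # inadequate, benign at surgery
        (p_indet * p_hyper, 1.0),                        # hyperfunctioning (assumed benign)
        (p_indet * (1.0 - p_hyper) * p_cancer_indet, 1.0),
        (p_indet * (1.0 - p_hyper) * (1.0 - p_cancer_indet), u_surgery_benign),
        (p_adequate * p_cancer_adequate * sens, 1.0),                          # true positive
        (p_adequate * p_cancer_adequate * (1.0 - sens), u_observation_cancer), # false negative
        (p_adequate * (1.0 - p_cancer_adequate) * spec, 1.0),                  # true negative
        (p_adequate * (1.0 - p_cancer_adequate) * (1.0 - spec), u_surgery_benign),  # false positive
    ]
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("terminal-node probabilities do not sum to 1")
    return sum(p * u for p, u in outcomes)


# Conservative and optimistic FNAB parameter sets from the text;
# p_hyper=0.05 is a hypothetical value chosen for illustration.
ev_conservative = routine_fnab_ev(0.08, 0.10, 0.15, 0.95, 0.95, p_hyper=0.05)
ev_optimistic = routine_fnab_ev(0.08, 0.05, 0.10, 0.98, 0.98, p_hyper=0.05)
print(f"conservative EV: {ev_conservative:.4f}")
print(f"optimistic EV:   {ev_optimistic:.4f}")
```

Under these assumptions the sketch yields approximately 0.7565 and 0.861, which agrees with the range of EVs reported for routine FNAB. Most of the lost value comes from surgery for benign disease after inadequate or indeterminate FNAB.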
The third alternative involves the application of SRU guidelines (1) and starts with US assessment for microcalcifications. If US is negative (i.e. microcalcifications absent), the nodule is observed. This can be a true negative [probability = negative predictive value (NPV) of microcalcifications] or a false negative (probability = one minus NPV of microcalcifications). If US is positive (microcalcifications present), FNAB is performed. The remaining events in the decision tree are identical with those presented for routine FNAB, except that the pre-FNAB probability of cancer is approximated by the positive predictive value (PPV) of microcalcifications (i.e. approximately 28% based on baseline model assumptions).
The fourth subtree models the AACE guidelines (2) and approximates National Comprehensive Cancer Network guidelines (8). Evaluation starts with an US assessment for hypoechogenicity, microcalcifications, irregular margins, chaotic intranodular vascular spots, and a more-tall-than-wide shape. If US does not disclose a hypoechoic nodule with at least one other concerning US feature (i.e. is negative), then the nodule is observed. This can be a true negative (probability = NPV of hypoechoic plus at least one other concerning US feature) or false negative (probability = one minus NPV). If US reveals a hypoechoic nodule with at least one other concerning US feature, FNAB is performed. The decision tree is thereafter identical with that for routine FNAB, except the pre-FNAB test probability of cancer approximates the PPV of hypoechoic plus at least one other concerning US feature (i.e. approximately 21% based on baseline model assumptions).
Decision analysis model parameters
Table 1 lists parameter estimates for the events, outcomes, and utilities represented in the model. These estimates were primarily derived from clinical guidelines (1, 2, 3, 8) or literature cited within the guidelines. The utility of missed (delayed) cancer diagnosis and the utility of surgery for a benign nodule were both assigned values of zero for the baseline case.
There was uncertainty regarding the appropriate estimate for some parameters. AACE guidelines cite FNAB sensitivity and specificity of 95% (2), whereas some literature suggests 98% (10, 11, 12). AACE guidelines suggest a 10–20% likelihood of inadequate FNAB (2), but it may be 5% when US guidance and repeat FNAB are routinely used (10, 11, 20, 21, 22, 23). Whereas ATA guidelines endorse a 15–30% probability of indeterminate FNAB reading (3), AACE guidelines and other sources suggest a 10% likelihood (2, 10, 11). Because of these uncertainties, and because these estimates substantially influence model results, we first performed decision analysis using conservative FNAB parameter estimates (i.e. 95% FNAB sensitivity and specificity; 10% likelihood of persistently inadequate FNAB; and 15% likelihood of indeterminate FNAB). We subsequently repeated the analysis using more optimistic FNAB parameter estimates (i.e. 98% FNAB sensitivity and specificity; 5% likelihood of persistently inadequate FNAB; and 10% likelihood of indeterminate FNAB).
Bayesian revision of probabilities
The key diagnostic tests in the model are FNAB, US for microcalcifications, and US for hypoechoic plus at least one other ultrasonographic risk factor. Bayesian revision was used to calculate the probability of a positive test result, the PPV of a positive test result, and the NPV of a negative test result using the following formula:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid \text{not } A)\,P(\text{not } A)}$$
These quantities were calculated using the estimates of test sensitivity P(B|A), test specificity [1 – P(B|not A)], and the pretest probability of cancer P(A) (i.e. the underlying prevalence of thyroid cancer in the group undergoing testing) with the following formulae:
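$$P(B) = P(B \mid A)\,P(A) + P(B \mid \text{not } A)\,[1 - P(A)]$$

$$\text{PPV} = P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

$$\text{NPV} = P(\text{not } A \mid \text{not } B) = \frac{[1 - P(B \mid \text{not } A)]\,[1 - P(A)]}{1 - P(B)}$$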
In the SRU and AACE arms, the pre-US probability of cancer equaled the prevalence of cancer among solitary thyroid nodules. In the routine FNAB arm, the pre-FNAB probability approximated the prevalence of cancer among solitary thyroid nodules. In the SRU and AACE arms, the pre-FNAB probability of cancer approximated the PPV of microcalcifications and hypoechoic plus at least one other ultrasonographic risk factor, respectively.
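The Bayesian revision described above can be sketched in a few lines of Python. This is a minimal illustration, not the software used for the analysis. The function name is hypothetical, and the example inputs are assumptions: a microcalcification sensitivity of 0.44 (roughly implied by the observation that about 56% of cancers would be missed under SRU guidelines) and a specificity of 0.90 (a hypothetical value chosen for illustration, because Table 1 is not reproduced here), applied to a pretest probability of 8%.

```python
def bayes_revision(sensitivity, specificity, pretest):
    """Return (P(positive test), PPV, NPV) for a binary diagnostic test."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("pretest", pretest)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")

    # P(B) = P(B|A) P(A) + P(B|not A) [1 - P(A)]
    p_positive = sensitivity * pretest + (1.0 - specificity) * (1.0 - pretest)
    if p_positive in (0.0, 1.0):
        raise ValueError("PPV or NPV is undefined for these inputs")

    ppv = sensitivity * pretest / p_positive
    npv = specificity * (1.0 - pretest) / (1.0 - p_positive)
    return p_positive, ppv, npv


# Assumed illustrative inputs (see lead-in): sens 0.44, spec 0.90, prevalence 8%.
p_pos_mca, ppv_mca, npv_mca = bayes_revision(0.44, 0.90, 0.08)
print(f"P(positive US) = {p_pos_mca:.4f}")
print(f"PPV = {ppv_mca:.3f}, NPV = {npv_mca:.3f}")
```

With these assumed inputs the PPV comes out near 0.28, in line with the approximate 28% pre-FNAB probability of cancer quoted for the SRU arm.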
Inadequate and indeterminate FNAB results were considered to be neither positive nor negative and thus were not subjected to Bayesian revision. The baseline likelihood of cancer in these situations was considered to be 8 and 20%, respectively. Because the probability of cancer for indeterminate FNABs was fixed at 20% (rather than 8%), the overall percentage of cancer in each decision alternative subtree could vary slightly, depending on the proportion entering the indeterminate FNAB subtree. To correct this small inequality, the prevalence of thyroid cancer in those with adequate and determinate FNABs was calculated as the overall risk of cancer minus the a priori likelihood of having cancer after a negative US (if applicable), the a priori likelihood of finding cancer after surgery for persistently inadequate FNAB, and the a priori likelihood of finding cancer after surgery for indeterminate FNAB; all divided by the a priori likelihood of being in the adequate and determinate FNAB subtree. This correction ensured that the overall likelihood of reaching a terminal node denoting thyroid cancer was exactly 8% for each decision alternative subtree. The following equation represents this calculation for the SRU arm:
Probability of cancer in an adequate and determinate FNAB = (P(cancer) – ((P(pos_mCa) × P(inad_FNAB) × PPV(mCa)) + (P(pos_mCa) × P(indet_FNAB) × (1 – P(hyperfunction)) × P(inad_cancer)) + ((1 – P(pos_mCa)) × (1 – NPV(mCa))))) / (P(pos_mCa) × (1 – (P(inad_FNAB) + P(indet_FNAB)))).
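The correction for the SRU arm can be checked numerically. The Python sketch below is a hedged illustration under several assumptions: the likelihood of cancer after an indeterminate FNAB is taken as the 20% stated in the text (the corresponding term is labeled P(inad_cancer) in the equation), the microcalcification sensitivity and specificity (0.44 and 0.90) and the 5% likelihood of hyperfunction are hypothetical illustrative values, and all names are invented for this example. The check confirms that the overall likelihood of cancer in the subtree returns to 8%.

```python
def sru_cancer_components(p_pos_mca, ppv_mca, npv_mca, p_inad, p_indet,
                          p_hyper, p_cancer_indet):
    """A priori likelihoods of cancer outside the adequate, determinate branch."""
    after_negative_us = (1.0 - p_pos_mca) * (1.0 - npv_mca)
    after_inadequate = p_pos_mca * p_inad * ppv_mca
    after_indeterminate = p_pos_mca * p_indet * (1.0 - p_hyper) * p_cancer_indet
    return after_negative_us, after_inadequate, after_indeterminate


def adjusted_prevalence_sru(p_cancer, p_pos_mca, ppv_mca, npv_mca,
                            p_inad, p_indet, p_hyper, p_cancer_indet):
    """Probability of cancer in an adequate and determinate FNAB (SRU arm)."""
    components = sru_cancer_components(p_pos_mca, ppv_mca, npv_mca,
                                       p_inad, p_indet, p_hyper, p_cancer_indet)
    p_adequate = p_pos_mca * (1.0 - (p_inad + p_indet))
    return (p_cancer - sum(components)) / p_adequate


# Assumed illustrative inputs; only p_cancer, p_inad, p_indet and the 20%
# figure come from the text.
p_cancer, sens_mca, spec_mca = 0.08, 0.44, 0.90
p_inad, p_indet, p_hyper, p_cancer_indet = 0.10, 0.15, 0.05, 0.20

# Bayesian revision for microcalcifications.
p_pos_mca = sens_mca * p_cancer + (1.0 - spec_mca) * (1.0 - p_cancer)
ppv_mca = sens_mca * p_cancer / p_pos_mca
npv_mca = spec_mca * (1.0 - p_cancer) / (1.0 - p_pos_mca)

p_adjusted = adjusted_prevalence_sru(p_cancer, p_pos_mca, ppv_mca, npv_mca,
                                     p_inad, p_indet, p_hyper, p_cancer_indet)

# Recombine all cancer-bearing branches; the total should equal p_cancer.
p_adequate = p_pos_mca * (1.0 - (p_inad + p_indet))
total_cancer = sum(sru_cancer_components(p_pos_mca, ppv_mca, npv_mca, p_inad,
                                         p_indet, p_hyper, p_cancer_indet))
total_cancer += p_adequate * p_adjusted
print(f"adjusted prevalence: {p_adjusted:.4f}")
print(f"total P(cancer) in subtree: {total_cancer:.4f}")
```

The adjusted prevalence differs only modestly from the PPV of microcalcifications, which is consistent with the description of this step as a correction for a small inequality.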
Sensitivity analyses
One- and two-way sensitivity analyses were conducted to assess the extent to which results from the model depend on estimates for specific parameters in the model. Sensitivity analysis compares the EV obtained by decision alternatives across the plausible range of estimates for specific parameters in the model. One-way sensitivity analysis evaluates changes in the model results across values for a single model parameter while holding all other estimates constant. Two-way analysis considers changes for alternative combinations of two variable estimates while holding all others constant. The range of values (minimum to maximum) used for each parameter is shown in Table 1.
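The mechanics of a two-way sensitivity analysis on the two utilities can be sketched as follows. This Python example is illustrative only and does not reproduce the published model: it compares just two alternatives (observation only and routine FNAB), and the outcome probabilities for routine FNAB are assumed values from a simplified roll-up using the conservative FNAB parameters and a hypothetical 5% likelihood of hyperfunction. All names are invented for this example.

```python
def expected_value(p_surgery_benign, p_missed_cancer,
                   u_surgery_benign, u_observation_cancer):
    """EV of a strategy when every other outcome has a utility of 1.0."""
    p_other = 1.0 - p_surgery_benign - p_missed_cancer
    return (p_other
            + p_surgery_benign * u_surgery_benign
            + p_missed_cancer * u_observation_cancer)


# (P(surgery for benign disease), P(missed cancer)) for each strategy.
# The routine FNAB values are ASSUMED illustrative figures, not published ones.
STRATEGIES = {
    "observation_only": (0.0, 0.08),
    "routine_fnab": (0.241325, 0.002175),
}


def two_way_sensitivity(strategies, steps=10):
    """Map each (u_surgery_benign, u_observation_cancer) pair to the best strategy."""
    preferred = {}
    for i in range(steps + 1):
        for j in range(steps + 1):
            u_sb = i / steps
            u_oc = j / steps
            evs = {
                name: expected_value(p_sb, p_mc, u_sb, u_oc)
                for name, (p_sb, p_mc) in strategies.items()
            }
            # Keep the strategy with the highest expected value.
            preferred[(u_sb, u_oc)] = max(evs, key=evs.get)
    return preferred


preferred = two_way_sensitivity(STRATEGIES)
print(preferred[(0.0, 0.0)])  # both utilities at the baseline value of zero
print(preferred[(1.0, 0.0)])  # surgery for benign disease carries no penalty
```

Each grid point holds every probability constant and varies only the two utilities, which is the sense in which the analysis is two-way. A one-way analysis corresponds to a single row or column of this grid.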
Decision analysis software
All analyses were performed using TreeAge Pro 2006 (TreeAge Software, Inc., Williamstown, MA).
| Results |
|---|
One-way sensitivity analyses suggest that the highest EV obtained for the decision alternatives is sensitive to the parameters P(cancer), P(inad_FNAB), P(indet_FNAB), sens(mCa), and spec(mCa) when conservative FNAB parameters are used in the model, and P(cancer), P(inad_FNAB), P(indet_FNAB), spec(FNAB), and sens(mCa) when more optimistic FNAB parameters are used. Tables 2 and 3 detail the specific thresholds at which there is a change in the decision alternative with the highest EV. Routine FNAB had the lowest EV throughout all plausible parameter estimates used in these one-way sensitivity analyses.
56% of all cancer cases). An intermediate proportion (4.5–7.8%) of patients participating in the AACE approach will have surgery for benign disease, with few having a missed malignancy (0.8–1.0% of participating patients, or 10–12.5% of all cancer cases).
| Discussion |
|---|
Compared with routine FNAB, the approach outlined in SRU guidelines is associated with a much lower rate of surgery for benign disease but a substantially higher risk of missed malignancy. Approximately 56% of cancers would be observed for a period of time (e.g. until nodule growth is observed) under SRU guidelines. Both the lower risk of surgery_benign and the higher risk of observation_cancer reflect the low sensitivity of microcalcifications, which limits the number of FNABs performed. Given the higher sensitivity of hypoechoic plus at least one other ultrasonographic risk factor, AACE guidelines yield a lower risk of missed malignancy (10–12.5% of those with cancer). However, because more FNABs will be performed under AACE guidelines, the likelihood of surgery for benign disease exceeds that of the SRU approach. The EV of the AACE approach increases relative to that of the SRU approach as the estimated utility for missed malignancy decreases, whereas the EV of the SRU approach increases relative to that of the AACE approach as the utility of surgery for benign disease decreases.
Although the EVs of routine observation, SRU guidelines, and AACE guidelines are very similar at baseline, two-way sensitivity analysis suggests that the AACE or SRU approaches are preferred for most plausible combinations of utility estimates for surgery_benign and observation_cancer. Although routine observation will yield no cases of surgery for benign disease over the short term, cancer diagnosis will always be delayed with this approach. Routine observation has the highest EV only when the utility of surgery for benign disease is lower than that of missed cancer (Fig. 2). Because such assignments seem generally unlikely, we anticipate that few would prefer routine observation.
We specified a solitary, solid (or predominantly solid) nodule for this analysis. However, this analysis is relevant to any nodule (solitary or part of a multinodular gland) with a pretest probability of cancer approximating 8%. We also assumed that FNAB would be performed under US guidance, but we recognize that not all endocrinologists use US guidance for every FNAB. If FNAB without US guidance were to substantially increase nondiagnostic and false-negative rates, compared with the estimated rates used in this analysis, then the EV results reported herein may not be applicable.
The time horizon considered in this analysis was approximately 1 yr. This time frame was chosen because many endocrinologists would perform follow-up US at 1 yr, and because observational data suggest that a greater than 1 yr delay of cancer diagnosis can negatively impact prognosis (24). Ideally, the decision analysis model would consider disease progression for patients with missed cancers and other events occurring over several years. However, we felt that such a model would be excessively complex and speculative. For example, the natural history of untreated 10- to 14-mm thyroid cancers remains ill defined, and it is unclear what proportion of initially observed cancers would grow sufficiently to prompt FNAB, over what time frame said growth would occur, what proportion would exhibit metastatic spread over time, etc. Also, many benign nodules would grow and prompt FNAB, adding to model complexity.
FNAB is a safe and relatively noninvasive diagnostic tool for assessing thyroid nodules. Use of FNAB has reduced the numbers of thyroid surgeries (thus reducing cost of care) and increased cancer yields at surgery (10). In general, FNAB is highly accurate (2, 3, 10, 11, 12). However, a major limitation of FNAB is inadequate or indeterminate results, which occur in 10–25% of cases (2, 3, 10, 11, 12, 20, 21, 22). In such cases, surgery is usually recommended for diagnosis (2, 3, 23), and the majority of these nodules will prove to be benign. Thus, the likelihood of having surgery for benign disease is proportional to the percentage of patients undergoing FNAB. There is great interest in molecular markers to distinguish benign from malignant nodules in the setting of an indeterminate FNAB; and to the extent that they would reduce the likelihood of surgery for benign disease, the EV of routine FNAB would increase. However, such markers are not currently reliable enough for clinical use (23). Given these limitations of FNAB, many propose that FNAB is best limited to nodules with high-risk US characteristics. Unfortunately, some thyroid cancers do not manifest concerning US characteristics (18, 19), and the risk of delayed diagnosis will increase when FNAB is restricted. In such cases, repeat FNAB should be pursued if significant nodule growth is observed (3, 11). However, the presence or absence of growth is not a reliable marker of a nodule's malignant or benign nature, respectively (2, 3, 10, 12).
Surgery for benign disease and delayed cancer diagnosis have different morbidity, mortality, and cost outcomes (see below). However, because utility values are by nature subjective, we assigned baseline utility values of 0 for both parameters. In this regard, the two-way sensitivity analysis is especially useful because it allows individual physicians and patients to determine a most-desired approach based on individually assigned utility values. To this end, relevant participants would first need to assign utilities to undergoing surgery for benign disease and missed (delayed) cancer diagnosis. A method by which individuals can reasonably assign utility values is required. Such methods include standard gamble, time trade-off, willingness to pay, and visual thermometer rating scales (25).
Before the assignment of utilities, the known and potential risks of thyroid surgery and delayed cancer diagnosis should be reviewed. Hemithyroidectomy/isthmusectomy is often recommended when FNAB is indeterminate or repeatedly inadequate. In experienced hands, this is a low-risk procedure: the risks of general anesthesia are low for most patients, and the risk of permanent hypoparathyroidism or recurrent laryngeal nerve injury is very low (<1%) (2, 26, 27). However, thyroid hormone supplementation will be required in 20–30% of patients (28, 29, 30, 31). Moreover, the inconvenience and cost of surgery are substantial.
In contrast to the well-defined risks of thyroid surgery, the risks of delayed treatment of a 10- to 14-mm thyroid cancer (without clinical metastases) are unclear. Thyroid cancers are usually indolent, and the long-term prognosis of differentiated thyroid cancer is excellent, at least when treated promptly. The 5-yr survival rate for differentiated thyroid cancer is 99.7% when confined to the thyroid and 96.9% when there is regional spread (32). Similarly, 30-yr mortality is low (<1%) in patients with primary tumors less than 1.5 cm (24). Also, small (<15 mm, but usually <5 mm) papillary cancers may be seen in up to 6% of autopsies in the United States (33). This represents an estimated two-logarithm difference, compared with clinical cancer rates (34), suggesting that the majority of small thyroid cancers are not clinically relevant.
Size criteria for FNAB are largely based on the association between primary cancer size and prognosis (24). Routine FNAB of nodules less than 10 mm is often discouraged (3) because these microcarcinomas infrequently metastasize and carry a small risk of recurrence and mortality after surgical removal (35); and one study found no clear difference in cancer persistence, recurrence, or distant metastases when comparing papillary cancers 11–15 mm to those 10 mm or less (36). On the other hand, a minority of microcarcinomas follow a more aggressive course (2, 35, 37). One study suggested that the risk of local invasion/metastases increases with papillary cancers greater than 5 mm (38), whereas another study observed distant metastases with primary tumors as small as 8 mm (39). It is unknown to what extent early detection and surgical treatment of small cancers would prevent local or distant metastatic spread. Observational data suggest that differentiated thyroid cancer mortality rates may be doubled if initial treatment is delayed greater than 1 yr after discovery (24), although this analysis included primary tumors of all sizes and therefore may not apply to 10- to 14-mm cancers specifically. Thus, despite a 20–50% likelihood of lymph node involvement at primary surgical treatment (3), it remains unclear whether delayed diagnosis substantially impacts prognosis in the setting of a 10- to 14-mm thyroid cancer with no clinical evidence of metastases.
Lastly, one must also consider monetary costs and other implications of a general practice. For example, if FNAB were performed on all of the estimated 11–12% of the population with thyroid nodules 10 mm or greater (40), 1–2% of the population could receive hemithyroidectomy/isthmusectomy for benign disease.
In conclusion, routine FNAB has the lowest EV of the approaches evaluated in this decision analysis. The low EV of routine FNAB is primarily driven by the high percentage of patients who would have surgery for benign nodules. These results suggest that US criteria should be used to determine which 10- to 14-mm nodules should undergo FNAB.
| Footnotes |
|---|
Disclosure Statement: The authors have nothing to disclose. Neither author is a current member of the organizations responsible for the guidelines compared herein.
First Published Online May 27, 2008
Abbreviations: AACE, American Association of Clinical Endocrinologists; ATA, American Thyroid Association; EV, expected value; FNAB, fine-needle aspiration biopsy; NPV, negative predictive value; PPV, positive predictive value; SRU, Society of Radiologists in Ultrasound; US, ultrasonography.
Received February 26, 2008.
Accepted May 16, 2008.
| References |
|---|