Breast Imaging
1 From the Department of Radiology, Stanford University Medical Center, Rm S-068A, 300 Pasteur Dr, Stanford, CA 94305-5105 (D.M.I., R.L.B.); R2 Technology, Sunnyvale, Calif (K.F.O.); Tower-St John's Imaging-Eisenberg Keefer Breast Center, John Wayne Cancer Institute, St John's Health Center, Santa Monica, Calif (R.J.B.); and Department of Radiology, University of California, San Francisco (E.A.S.). From the 1999 RSNA scientific assembly. Received October 5, 2001; revision requested December 18; revision received March 14, 2002; accepted July 24. Address correspondence to D.M.I.
| ABSTRACT |
|---|
MATERIALS AND METHODS: Four hundred ninety-three pairs of consecutive mammographic findings were collected from 13 institutions, consisting of initial normal screening findings and a subsequent finding of cancer at screening (mean interval between examinations, 14.6 months). One designated radiologist reviewed each pair of mammograms and determined that 286 findings were judged visible at prior examination in locations where cancer later developed. Five blinded radiologists independently reviewed the prior findings in these 286 cases, identifying 169 mammograms (172 cancers) with findings so subtle that none or only one or two of the five radiologists recommended screening recall. Two unblinded radiologists reviewed the initial and subsequent findings and recorded descriptors and assessments for each finding and subjective factors influencing why, although the lesion was perceptible, it might have been undetected or not recalled.
RESULTS: Of 172 cancers, 129 (75%) were invasive (112 T1 tumors and 17 T2 tumors or higher; median diameter, 10 mm), and 43 (25%) were ductal carcinoma in situ (median size, 10 mm). On the prior mammograms, 80% (137 of 172) of these cancers had subtle nonspecific findings where cancer later developed, and most were assessed as being normal or benign in appearance.
CONCLUSION: There is a subset of cancers that display perceptible but nonspecific mammographic findings that do not warrant recall, as judged by both a majority of blinded radiologists and by unblinded reviewers. We believe failure to act on these nonspecific findings prospectively does not necessarily constitute interpretation below a reasonable standard of care.
© RSNA, 2003
Index terms: Breast neoplasms, diagnosis, 00.32, 00.30
| INTRODUCTION |
|---|
At screening mammography, failure to act on classic or atypical findings of cancer is usually due to errors in perception or analysis. On the other hand, failure to act on nonspecific mammographic findings is not tantamount to an error in judgment. In contrast to typical errors of detection or analysis, nondetection or nonaction with regard to nonspecific findings occurs because there are no classic or atypical signs of cancer to detect. Simply put, there are no abnormal findings to be overlooked or missed. Perceived but nonspecific mammographic findings would correctly be interpreted as normal or benign.
Few reports on missed cancers in the radiology literature describe nonspecific findings. The terms for nonspecific findings are often mentioned in reports of missed cancers, but the nonspecific findings are often grouped with other categories of missed cancers (4–6). In reviews (1–4,6) of prior mammograms that showed cancer at the time of screening or at time intervals between screenings, investigators named these findings "nonspecific signs," "minimal signs present," or "unspecific appearances" or grouped nonspecific findings with "unrecognized signs."
For purposes of the present study, we named this last category "nonspecific findings," defined as perceptible normal or benign mammographic findings where cancer later developed. Examples of such nonspecific findings included focal densities or benign-appearing calcifications occurring in the location where cancer later developed that were so nonspecific for malignancy that prospective recall from screening mammography was not warranted.
The purpose of our study was to retrospectively review these nonspecific findings on prior screening mammograms to determine what features were most often deemed normal or benign despite the development of breast cancer in the same location detected at follow-up screening.
| MATERIALS AND METHODS |
|---|
Four hundred ninety-three of the 1,083 cases with prior screening mammograms were available for review. The mean time between examinations was 14.6 months (range, 9–24 months). Sixty-two of these 493 cases were excluded because of prior breast surgery that resulted in scars or findings affected by metallic skin markers. Four other cases were excluded because the original mammograms were needed at the facility site before the end of our study, leaving a total of 427 cases in the study cohort.
One of three board-certified radiologists, other than the facility radiologists, reviewed the 427 cases to determine if the cancers were visible in retrospect on the prior mammograms. One radiologist reviewed 242 mammograms, one radiologist reviewed 103 mammograms, and one radiologist reviewed 82 mammograms. Each radiologist used the previously created film overlays to locate the cancer on the mammograms. The overlay was then superimposed on the prior mammograms with normal findings. If a perceptible finding was deemed visible on the prior mammogram, the radiologist marked the location of the retrospectively visible finding by using a second set of transparent film overlays, creating a reference location of the subsequent cancer on the prior mammogram.
Findings were judged visible on the prior mammograms in locations where cancer later developed in 286 of the 427 (67%) cases. The 286 prior mammograms were divided into four sets, each with approximately 75 cases. Forty-five additional cases were added to each case set: five cases in which no abnormalities could be seen on the prior mammograms, 20 cases with small subtle cancers, and 20 cases with normal findings (confirmed by means of at least one subsequent mammographic examination with normal findings in the following 2 years).
To determine if the normal findings on the prior mammograms should have been evaluated further, four groups of five radiologists performed a blinded review of the four case sets. The radiologists were unaware of the study purpose and the case mix. These 20 radiologists (10 with a primary focus in mammographic interpretation) were all certified according to the Mammography Quality Standards Act, had practiced radiology for a mean of 17 years (range, 3–35 years), and had read a mean of 300 screening mammograms per month (range, 40–1,000). Each radiologist independently assessed approximately 120 cases and categorized them according to the American College of Radiology Breast Imaging Reporting and Data System (BI-RADS) (9). BI-RADS categories 1 and 2 indicated normal or benign findings, and categories 0, 4, and 5 indicated abnormal findings. The use of category 3 (probably benign) was discouraged; however, the data showed 16 category 3 cases, which were grouped with the category 1 and 2 cases for purposes of our study. For clarity, we will refer to cases that the majority of five radiologists assessed as having a BI-RADS category of 0, 4, or 5 as abnormal, meaning the finding required immediate action. We will refer to cases that the majority of radiologists judged as having a BI-RADS category of 1, 2, or 3 as normal, meaning that the findings were normal or benign or did not require immediate action. The blinded radiologists were provided with patient age and shown only the prior mammograms with normal findings (mammograms obtained 9–24 months before the cancer was diagnosed at screening). They were not provided with any mammograms obtained earlier. The blinded radiologists had 84% mean sensitivity for cancer detection in the 20 cases with subtle findings that were added to each case set and 81% mean specificity for the 20 normal cases added to each case set.
Mammograms with three or more abnormal findings at the reference location as judged by the five blinded radiologists were considered to have cancers that were initially missed. The rationale is that if a majority of radiologists in a blinded review interpreted the findings as needing immediate work-up, then the finding was prospectively missed (8). One hundred twelve mammograms were judged as having abnormal findings according to these criteria by a majority of the radiologists and were excluded.
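The majority rule applied to the five blinded readings can be illustrated with a short sketch. This is an assumption-laden illustration rather than the software used in the study: the function name `classify_case` and the labels `"missed"` and `"nonspecific"` are hypothetical, and only the BI-RADS groupings (0, 4, and 5 as abnormal; 1, 2, and 3 as normal) come from the text.

```python
# Illustrative sketch only; names and labels are hypothetical.
ABNORMAL = {0, 4, 5}   # BI-RADS categories treated as requiring immediate action
NORMAL = {1, 2, 3}     # category 3 was grouped with 1 and 2 for this study


def classify_case(readings):
    """Classify one case from the five blinded BI-RADS readings.

    Returns "missed" when three or more readers chose an abnormal
    category, and "nonspecific" otherwise.
    """
    if len(readings) != 5:
        raise ValueError("expected exactly five readings per case")
    if any(r not in ABNORMAL | NORMAL for r in readings):
        raise ValueError("readings must be BI-RADS categories 0-5")
    abnormal_votes = sum(r in ABNORMAL for r in readings)
    return "missed" if abnormal_votes >= 3 else "nonspecific"


# Two of five readers recommend recall: the finding stays in the study group.
print(classify_case([1, 2, 0, 4, 1]))
# Three of five readers recommend action: the case is excluded as missed.
print(classify_case([0, 4, 5, 1, 2]))
```

With five readers there are no ties, so every case falls into exactly one of the two groups.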
Three or more of the five blinded radiologists judged the findings on the remaining 174 mammograms as normal, benign, or requiring no immediate work-up. We classified these mammograms as having nonspecific findings by using the rationale that if a majority of radiologists in a blinded review interpreted the findings as normal, then the finding was very subtle, normal, or benign in appearance. Five of the 174 mammograms with nonspecific findings had incomplete or missing data and were excluded. The remaining 169 mammograms with 172 cancers constituted our final study group. At the time of the initial study, all mammograms were digitized with an LS85 digitizer (Lumisys, Sunnyvale, Calif) at 50-µm resolution and printed with an Imation HQ969 laser printer (Imation Enterprises, St Paul, Minn) at 12 bits per pixel and 100-µm resolution. These digitized images were then printed on film for subsequent case review.
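The case selection described above can be tallied as a quick arithmetic check. The sketch below uses only the counts reported in this section; the variable names are hypothetical and chosen for readability.

```python
# Case counts as reported in Materials and Methods; variable names are illustrative.
available = 493                    # cases with prior mammograms available for review
excluded_surgery_or_markers = 62   # prior surgery scars or metallic skin markers
excluded_films_returned = 4        # originals needed back at the facility site
cohort = available - excluded_surgery_or_markers - excluded_films_returned

visible_in_retrospect = 286        # findings judged visible at the cancer location
majority_abnormal = 112            # three or more of five readers chose 0, 4, or 5
incomplete_data = 5                # nonspecific cases with incomplete or missing data

nonspecific = visible_in_retrospect - majority_abnormal
final_group = nonspecific - incomplete_data
visible_pct = round(100 * visible_in_retrospect / cohort)

print(cohort, nonspecific, final_group, visible_pct)
```

The tallies agree with the reported figures: 427 cases in the cohort, 174 mammograms with nonspecific findings, 169 mammograms in the final study group, and 67% of cohort cases with a visible prior finding.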
Unblinded Radiologist Case Review
The purpose of the unblinded review was to have radiologists who specialized in breast imaging and who knew the reference location of the subsequent cancer assess the findings independently, to retrospectively reconfirm BI-RADS categories, to categorize the appearance of each finding, and to determine the reasons why the findings were so nonspecific as to merit prospective assessments of BI-RADS categories 1, 2, and 3 by a majority of the blinded radiologists. Two radiologists (D.M.I., R.L.B.) who specialized in breast imaging jointly reviewed the 169 mammograms in an unblinded review to categorize the findings and to assess possible reasons for nondetection and nonaction at initial assessment. To ensure that the digital copies were of sufficient quality for analysis, 20 original mammograms selected to include both masses (n = 12) and calcifications (n = 8) were recalled from the facility sites and compared side by side with the copies on dedicated mammography alternators. The original mammogram and the copies were rated by the two radiologists for image quality with a numerical rating of 1–5 (1 = unable to read, 3 = acceptable, and 5 = good) and a narrative description of mass or calcification visibility. The average quality ratings for the original images (4.5) and for the copies (4.4) were similar. The narrative descriptions indicated that the copy quality did not compromise mass or calcification detection, which further supports the acceptability of using image copies for our study.
To assess the mammographic characteristics of the visible findings, the 169 prior mammogram copies with normal findings, subsequent follow-up mammograms on which the cancers were detected, and the clear reference overlays were reviewed with a two-tiered dedicated motorized mammography alternator with bright lights and magnifying lenses available for use. The four-view prior mammogram with normal findings and the reference overlay were displayed on the top row, and the mammogram showing the cancer 9–24 months later with the reference overlay was displayed on the bottom row. At the time of case review, although the location of the subsequent cancer was available on the follow-up mammograms, no patient information, examination dates, or pathologic data were available to the reviewers.
The perceptible finding identified with the reference overlay on the prior mammogram was analyzed according to finding type, diameter, location, and depth within the breast. Breast density was also recorded. Each finding was categorized by using the BI-RADS lexicon for masses and calcifications and BI-RADS categories 0–5, excluding BI-RADS category 3 (9). As we endeavored to fully describe all findings, we added several non-BI-RADS terms to describe the normal and benign findings not addressed in the lexicon. Terms for nonspecific findings included focal islands of normal-appearing tissue, benign-appearing calcifications, few benign-appearing calcifications, and densities. Otherwise, findings were characterized by using the BI-RADS lexicon when possible. The term that best described the major character of the perceived finding was chosen as the finding type.
To explain possible reasons why the findings were originally interpreted as normal, we (D.M.I., R.L.B.) recorded features that might have led to dismissal of the finding, including an appearance of focal normal tissue, benign-appearing calcifications, too few calcifications to prompt patient recall, lucent lines within the finding that simulated intermixed fat or crossed Cooper ligaments, other similar calcifications in the breast, or findings too small to prompt work-up. We also recorded features that may have led to nonperception or nondetection of the finding, including findings seen on only one view, subtle clusters of calcifications, multiple findings in the breast, a single benign but prominent and distracting lesion, a finding located at the edge of the glandular tissue or at the edge of the image, dense breast tissue, focal overlying breast tissue or blood vessels that might obscure the finding, and a very large breast in which small lesions might be missed. Multiple factors were recorded for each lesion, if appropriate.
We reviewed the pathology reports for each cancer that was subsequently detected on the follow-up screening mammogram and compared each subtle finding on the prior mammogram with the subsequent cancer type and grade.
Statistical Analysis
A χ² test for concordance (CHITEST function; Microsoft Excel, Microsoft, Redmond, Wash) was performed to determine if more invasive carcinomas developed in recalled versus nonrecalled cases.
| RESULTS |
|---|
One hundred seventy-two cancers were detected at follow-up screening mammography in the 169 patients. The average time between the prior screening examination with normal findings and the cancer diagnosis assigned at follow-up screening was 14.6 months (range, 9–24 months). Forty-three (25%) of the 172 cancers were ductal carcinoma in situ (DCIS), and the remaining 129 (75%) were invasive cancers at the time of diagnosis. The median lesion size of DCIS was 10 mm (range, 2–75 mm). Twenty-four of 43 (56%) DCIS sizes were obtained from the pathology report, and the rest were obtained from the mammographic measurement of abnormal calcifications. The median lesion size of invasive cancers was 10 mm (range, 1–55 mm). One hundred nineteen of 129 (92%) invasive cancer sizes were obtained from the pathology report, and the rest were obtained from measurement of the abnormal finding on mammograms. Of invasive cancers, 112 were T1 tumors, and 17 were T2 or higher. Of 104 women with invasive cancer and known axillary node status, 22 (21%) had lymph nodes positive for metastatic disease.
Table 1 summarizes features that might have contributed to nondetection of findings as judged at unblinded review.
Table 3 summarizes the types of findings considered normal on prior screening mammograms, stratified into cases interpreted as normal in retrospect at unblinded repeat review (137 findings) and recall cases at unblinded repeat review (35 findings). The two categories of unblinded normal and recall cases are further subdivided into invasive cancers and DCIS found subsequently for each type of finding.
The results of the χ² test were not significant (P = .06) for this comparison.
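The χ² comparison can be sketched from counts given elsewhere in the article. This is a hedged reconstruction, not the authors' spreadsheet: it assumes the comparison is the 2 × 2 table of unblinded-review assessment (normal vs recall) against cancer type (invasive vs DCIS). The 107 invasive cancers and 30 cases of DCIS among the 137 findings judged normal are reported in the Discussion; the 22 invasive cancers and 13 cases of DCIS among the 35 recall findings are obtained here by subtraction from the totals of 129 and 43. The function name `chi_square_2x2` is hypothetical, and no continuity correction is applied.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and P value for a 2 x 2 table (1 df)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: unblinded normal (137), unblinded recall (35); columns: invasive, DCIS.
# The second row is derived by subtraction and is an assumption of this sketch.
observed = [[107, 30], [22, 13]]
chi2, p_value = chi_square_2x2(observed)
print(f"chi-square = {chi2:.2f}, P = {p_value:.2f}")
```

Under these assumptions the P value comes to about .06, which is consistent with the value reported above.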
| DISCUSSION |
|---|
The results of our study demonstrate that there are subsets of perceptible but nonspecific findings on prior mammograms that should not be considered observer or interpretation errors. These nonspecific findings are composed mostly of densities that are indistinguishable from randomly distributed islands of fibroglandular tissue or scattered groups of tiny calcifications that most commonly represent fibrocystic change. Even though 137 of such findings were not recalled by a majority of five blinded radiologists and were judged as normal or benign at unblinded repeat review, 107 invasive cancers (78%) and 30 cases of DCIS (22%) developed at the same location as the subtle finding and were detected on follow-up screening mammograms 9–24 months later.
Few investigators describe nonspecific normal or benign findings on prior screening mammograms to serve as a comparison for our study. Unrecognized mammographic findings are often composed of both nonspecific findings and atypical but misinterpreted features of breast cancer. The minimal signs reported by van Dijck et al (10) comprised vague densities in 15 of 32 (47%) cases, densities in five (16%), microcalcifications in eight (25%), densities and microcalcifications in one (3%), and architectural distortion in three (9%). Similarly, our 137 nonspecific findings confirmed at unblinded review mostly comprised noncalcified findings, such as focal islands of fibroglandular tissue, masses, and densities (74 of 137, 54%), while fewer findings comprised calcifications (47 of 137, 34%). In a previous study, Ikeda et al (4) found 21 (22%) slightly abnormal findings that showed nonspecific signs in 96 interval cancers, including six nonspecific densities, four asymmetries, four benign-appearing calcification clusters, and four benign masses. Similar to those in the present study, the nonspecific signs of cancer in the study of Ikeda et al (4) were dominated by benign-appearing soft-tissue findings.
In two studies in which recall from mammographic screening would not be recommended, results were similar to ours. Wolverton and Sickles (5) prospectively evaluated 583 "doubtful mammographic findings" on screening mammograms in 382 women, in which all findings but one (low-grade DCIS) were normal at a mean follow-up interval of 30 months. Of note, most of their doubtful mammographic findings were composed of benign-appearing calcifications (48%), while the rest were noncalcified nodules (22%), vague densities (18%), asymmetries (7%), or a combination of findings (2%) that were interpreted as benign. The authors concluded that almost all of these prospectively marked benign findings were benign and inconsequential. Maes et al (6) reviewed nonspecific minimal signs (vague densities with an unsharp border, fewer than six clustered indefinable microcalcifications, and subtle architectural distortions) in a large population in the Netherlands to determine the frequency of recall and the effect these findings had on the screening program. Their nonspecific minimal signs were found in 53 (11%) of 500 women with normal or benign mammographic findings. The authors concluded that, on the basis of breast cancer prevalence and incidence in the Netherlands, the additional risk of such women developing breast cancer is about 0.5% and that regular mammographic screening, rather than recall, is a reasonable option. The results of these two studies show that nonspecific findings occur frequently in mammographic screening programs and that most do not represent cancer.
One limitation of our study was the lack of even older mammograms for comparison with the prior screening mammograms with normal findings. It is conceivable that comparison with these older mammograms might have shown that the nonspecific normal findings on our prior mammograms may have been either developing or new, warranting recall. We simply cannot tell or predict which of these cases could have been in this category. On the other hand, if the finding had been shown on an initial screening mammogram, as was simulated during the blinded review by the five radiologists, most radiologists would have interpreted the finding as normal. This shows that our case set contained findings that were very subtle or appeared benign or normal.
Another limitation of the current study is the second retrospective unblinded review that was necessary to characterize each subtle finding in this case set and to reconfirm BI-RADS categories. Unblinded retrospective reviews inherently lead to unintentional bias and may have led to the higher number of abnormal readings not evident in the initial blinded review by the five radiologists. Harvey et al (11) showed that nonpalpable breast cancers are often detected in retrospect on prior mammograms but that retrospective reviewers described the findings as suspicious more often than did blinded reviewers, a difference that was statistically significant. In their retrospective study, blinded reviewers were shown 73 prior mammograms with normal findings in patients who subsequently developed cancer, 30 (41%) of whom had mammographic evidence of cancer. The blinded reviewers interpreted these remaining 43 (59%) mammograms as normal, but retrospective reviewers found evidence of cancer (mostly asymmetric densities) on 25 mammograms (34%). Thus, it is not surprising that our unblinded review resulted in 35 suspicious findings in retrospect, likely because of the increased tendency of retrospective reviewers to detect abnormalities on mammographic review.
A possible criticism of our study is the initial grouping of BI-RADS category 3 cases with the BI-RADS category 1 and 2 cases. The 16 BI-RADS category 3 readings were a small percentage of all blinded readings, consisting of 1.8% (16 of 845) of all the readings in the 169 cases by the five radiologists. While this practice was discouraged in the initial blinded review, some radiologists still rated a few cases as BI-RADS category 3. We do not endorse this practice for clinical use, and in our practices, we do not allow final categorization of cases into BI-RADS category 3 without a recall from screening and a diagnostic work-up. For purposes of the present study, we grouped the 16 BI-RADS category 3 cases with BI-RADS category 1 and 2 cases by reasoning that these categories constituted findings that required no immediate action. Thus, one of the purposes of the second unblinded review was to have the two radiologists reconfirm overall BI-RADS categories. In this way, we could search for those findings that constituted BI-RADS category 1 and 2 cases at repeat review with the understanding that there are unintentional study biases introduced (addressed in the previous paragraph).
Interpretive error, involving either detection or diagnosis of breast cancer, is the most common reason that radiologists are sued for malpractice (11). At issue in such cases of alleged negligence is whether the abnormality identified on a prior mammogram should have been recalled or diagnosed as a suspicious finding by a reasonable and prudent radiologist practicing under similar circumstances (12). The essential element of such an analysis concerns foreseeable outcome; namely, is it reasonably foreseeable that the mammographic finding in question represents evidence of malignancy (13)? Conversely, is it reasonably foreseeable that the mammographic finding in question does not represent evidence of malignancy? The law requires neither a warranty of certainty nor accuracy, but it does require a reasonable approach (12,14,15).
The results of our study provide a basis for indicating which types of lesions may in fact represent cancer but which lack reasonable foreseeable outcome in a medicolegal context to necessarily prompt further evaluation because they are so nonspecific and are more likely to represent benign findings and normal variants. Expert witnesses reviewing cases of alleged negligence must distinguish between nonspecific findings that do not require recall and more important abnormalities that merit prompt additional imaging evaluation. To be fair and reasonable, expert reviews should be conducted in a manner that minimizes interpretation bias by blinding the reviewer to the location, mammographic features, and timing of subsequent cancer (similar to the blinded reviews of the four groups of five radiologists described in our study). In the setting of a medicolegal consultation, this can be achieved most effectively by viewing mammograms in the temporal sequence in which they were obtained and by limiting the reviewer's access to supporting clinical information to that known by the interpreting radiologist at the time of examination.
If prior findings are sufficiently specific and suspicious to prompt recall, failure to do so may fall below a recognized standard of care. However, when mammographic findings are present in the area that later develops more specific features of malignancy, and those earlier findings are nonspecific or subthreshold, the results of our study support the notion that failure to recall the patient does not necessarily fall below the standard of care.
Thus, the mere presence of a prior finding in a patient who was not recalled at the time of screening does not constitute medical negligence or unreasonable interpretation, which is the basis for liability in an allegation of malpractice. None of our patients were recalled prospectively, given the limitations of the study design, and 137 of the cases were judged as having a BI-RADS category of 1 or 2 at unblinded repeat review, supporting the contention that at least some of these findings should not necessarily be recalled. These 137 perceptible but nonspecific findings did not warrant recall as judged by a majority of a group of five blinded radiologists and by two unblinded reviewers, and failure to recall these cases does not constitute diagnostic error.
In summary, the results of our study show that there is a class of nonspecific findings that are perceptible on prior screening mammograms that do not warrant recall, and despite their presence in a location where cancer would later develop, we believe failure to identify these findings prospectively does not necessarily constitute interpretation below a reasonable standard of care.
| FOOTNOTES |
|---|
Author contributions: Guarantor of integrity of entire study, D.M.I.; study concepts and design, D.M.I., R.J.B.; literature research, D.M.I., R.L.B., E.A.S., R.J.B.; clinical studies, D.M.I., R.L.B., K.F.O.; data acquisition, D.M.I., R.L.B., K.F.O.; data analysis/interpretation, D.M.I., R.L.B., E.A.S., K.F.O.; statistical analysis, D.M.I., K.F.O.; manuscript preparation, D.M.I., R.L.B., E.A.S., R.J.B.; manuscript definition of intellectual content, all authors; manuscript editing, D.M.I., R.L.B., E.A.S., R.J.B.; manuscript revision/review and final version approval, all authors.
| REFERENCES |
|---|