Thoracic Imaging
1 From the Kurt Rossmann Laboratories for Radiologic Image Research, Department of Radiology, MC-2026, University of Chicago, 5841 S Maryland Ave, Chicago, IL 60637 (F.L., K.S., J.S., Q.L., H.A., R.E., H.M., K.D.); Department of Health Sciences, Faculty of Medicine, Kyushu University, Fukuoka, Japan (H.A.); and J. A. Azumi General Hospital, Nagano, Japan (S.S.). From the 2003 RSNA Annual Meeting. Received September 8, 2004; revision requested October 29; revision received December 27; accepted January 21, 2005. Supported in part by U.S. Public Health Service grants CA62625 and CA98119. Address correspondence to F.L. (e-mail: feng{at}uchicago.edu).
| ABSTRACT |
|---|
MATERIALS AND METHODS: Institutional review board approval and informed patient and observer consent were obtained. Seventeen patients (eight men and nine women; mean age, 60 years) with a missed peripheral lung cancer and 10 control subjects (five men and five women; mean age, 63 years) without cancer at low-dose CT were included in an observer study. Fourteen radiologists were divided into two groups on the basis of different image display formats: Six radiologists (group 1) reviewed CT scans with a multiformat display, and eight radiologists (group 2) reviewed images with a "stacked" cine-mode display. The radiologists, first without and then with the CAD scheme, indicated their confidence level regarding the presence (or absence) of cancer and the most likely position of a lesion on each CT scan. Receiver operating characteristic (ROC) curves were calculated without and with localization to evaluate the observers' performance.
RESULTS: With the CAD scheme, the average area under the ROC curve improved from 0.763 to 0.854 for all radiologists (P = .002), from 0.757 to 0.862 for group 1 (P = .04), and from 0.768 to 0.848 for group 2 (P = .01). The average sensitivity in the detection of 17 cancers improved from 52% (124 of 238 observations) to 68% (163 of 238 observations) for all radiologists (P < .001), from 49% (50 of 102 observations) to 71% (72 of 102 observations) for group 1 (P = .02), and from 54% (74 of 136 observations) to 67% (91 of 136 observations) for group 2 (P = .006). The localization ROC curve also improved.
CONCLUSION: Lung cancers missed at low-dose CT were very difficult to detect, even in an observer study. The use of CAD, however, can improve radiologists' performance in the detection of these subtle cancers.
© RSNA, 2005
| INTRODUCTION |
|---|
At baseline CT screening performed in a general population that included smokers and nonsmokers in Nagano, Japan (2), the fraction of lung cancers among the detected noncalcified lesions was 9% and the prevalence of cancers was only 0.48%. The corresponding data were 12% and 2.7%, respectively, for smokers in the U.S. Early Lung Cancer Action Project (3). In CT screening programs, however, 32%-39% of lung cancers (4,5) were missed in previous years, and the numbers of these missed cancers were not included in the determination of the prevalence of lung cancers in these studies. We previously reported (5) that 32 missed lung cancers were very difficult to detect in the Nagano series; in general, they were very subtle and appeared as small, faint nodules with ground-glass opacity (GGO) that overlapped normal structures or as opacities in a complex background of other diseases.
When an automated lung nodule-detection method (6) was used, 84% of these missed lung cancers in the Nagano series were marked by the computer; however, the false-positive rate was high (1.0 false-positive marks per section, 28 false-positive marks per study), which is not acceptable to radiologists. Recently, we developed a computer-aided detection (CAD) scheme (7) that is based on a difference-image technique for enhancing lung cancers and suppressing most normal background structures, and the false-positive rate has improved to about 3.0 marks per study (sensitivity, 87%) with use of a multiple massive training artificial neural network (8). Thus, the purpose of our study was to retrospectively evaluate whether a difference-image CAD scheme can help radiologists detect peripheral lung cancers missed at low-dose CT.
| MATERIALS AND METHODS |
|---|
Database
An annual low-dose CT screening program for lung cancer in Nagano, Japan, began in May 1996 and ended in March 1999. In the program, 17 892 examinations were performed in 7847 individuals (4288 men, 3559 women; mean age, 61 years; age range, 19-92 years). All individuals gave informed consent to undergo CT screening and for use of the data for research purposes. The database used in this study consisted of data from 38 low-dose CT examinations performed in 31 patients with missed peripheral lung cancers. All of the CT studies had been performed as part of the 3-year lung cancer screening program (5,6). Twenty-three cancers were missed because of detection errors, and 15 cancers were missed because of interpretation errors.
As described previously (5), the locations of missed lung cancers on sections obtained at 39 CT examinations (one examination was excluded from this study because of technical error) were determined in consensus by two radiologists (F.L. and S.S., with 20 and 42 years of experience, respectively). One radiologist (F.L.) measured the length and width of cancers on at least one section. Three radiologists (F.L., H.A., and H.M., with 20, 18, and 29 years of experience, respectively) first independently classified the low-dose CT scans with the 38 cancers into three patterns, and the final judgment was based on agreement by at least two radiologists. The mean diameter of the 38 lesions missed at low-dose CT was 12 mm (range, 6-26 mm). The following patterns were noted: 10 nodules had pure GGO (nonsolid), 16 had mixed GGO (part solid), and 12 had solid opacity.
The 31 missed cancers, which included 28 adenocarcinomas, two small cell carcinomas, and one squamous cell carcinoma, were confirmed with surgery. The CT examinations were performed with a mobile scanner (CT-W950SR; Hitachi Medical, Tokyo, Japan) with use of a low-dose protocol and a tube current of 25 or 50 mA, a scanning time of 2 seconds per rotation of the x-ray tube (tube rotation time, 2 seconds), a table speed of 10 mm/sec (pitch, 2), 10-mm collimation, and a 10-mm reconstruction interval. The mean number of sections per study was 30, and the pixel size was 0.586 or 0.684 mm for scans with a 512 x 512 image matrix size. The use of this database and the participation of radiologists in this observer performance study were approved by the University of Chicago Institutional Review Board. Informed consent for the observer performance study was obtained from all observers.
CAD Scheme
Our scheme was based on a difference-image technique (7,9,10) that enhances the lung nodules and suppresses most of the background normal structures. The difference image for each CT study was obtained by subtracting the nodule-suppressed image processed with a ring average filter from the nodule-enhanced image processed with a matched filter. By applying a multiple-gray-level threshold technique to the difference image, on which most nodules showed strong enhancement, the initial nodule candidates were identified. A number of false-positive findings were removed by using the two rule-based schemes on the localized image features related to morphologic characteristics and gray levels, and a false-positive rate of 15.8 per study was achieved (7). Most (81%) of the remaining false-positive findings were eliminated without removing any true-positive findings by using a multiple massive training artificial neural network trained to reduce various types of false-positive findings (8). The CAD scheme had a sensitivity of 87% (33 of 38 cancers) for 38 missed cancers, with an average of 3.0 false-positive findings per study (7,8).
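The difference-image step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian shape of the matched kernel, the ring radii, the threshold levels, the edge handling, and every function name are hypothetical choices made for readability, and the filters run on a tiny synthetic image rather than a CT section.

```python
import math

def gaussian_kernel(radius, sigma):
    """Nodule-like template (assumed Gaussian), normalized to sum to 1."""
    size = 2 * radius + 1
    k = [[math.exp(-((x - radius) ** 2 + (y - radius) ** 2) / (2.0 * sigma ** 2))
          for x in range(size)] for y in range(size)]
    total = sum(sum(row) for row in k)
    return [[v / total for v in row] for row in k]

def ring_kernel(inner, outer):
    """Averaging kernel over the annulus inner <= r <= outer (assumed radii)."""
    size = 2 * outer + 1
    k = [[1.0 if inner <= math.hypot(x - outer, y - outer) <= outer else 0.0
          for x in range(size)] for y in range(size)]
    total = sum(sum(row) for row in k)
    return [[v / total for v in row] for row in k]

def filter2d(image, kernel):
    """Correlate image with kernel; coordinates outside the image are clamped to the edge."""
    h, w = len(image), len(image[0])
    r = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky, krow in enumerate(kernel):
                yy = min(max(y + ky - r, 0), h - 1)
                for kx, kv in enumerate(krow):
                    xx = min(max(x + kx - r, 0), w - 1)
                    acc += kv * image[yy][xx]
            out[y][x] = acc
    return out

def difference_image(image, matched, ring):
    """Nodule-enhanced image minus nodule-suppressed image."""
    enhanced = filter2d(image, matched)
    suppressed = filter2d(image, ring)
    return [[e - s for e, s in zip(erow, srow)]
            for erow, srow in zip(enhanced, suppressed)]

def threshold_candidates(diff, levels):
    """For each gray-level threshold, list the (row, col) pixels at or above it."""
    return {t: [(y, x) for y, row in enumerate(diff)
                for x, v in enumerate(row) if v >= t]
            for t in levels}

# Synthetic 21 x 21 image: flat background plus one faint blob centered at (10, 10).
size = 21
image = [[100.0 + 50.0 * math.exp(-((x - 10) ** 2 + (y - 10) ** 2) / (2.0 * 1.5 ** 2))
          for x in range(size)] for y in range(size)]
diff = difference_image(image, gaussian_kernel(3, 1.5), ring_kernel(4, 6))
cands = threshold_candidates(diff, [5.0, 10.0, 20.0])
peak = max(((y, x) for y in range(size) for x in range(size)),
           key=lambda p: diff[p[0]][p[1]])
```

Because both kernels are normalized, a uniform background cancels in the subtraction and only compact, blob-like structures remain, which is the property the thresholding step relies on.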
Observer Study
Among 23 studies in which cancer was missed due to detection errors, 17 studies in 17 patients (eight men and nine women; mean age, 60 years; age range, 48-69 years) were performed the year before the cancers were found; the other six studies, including three that were performed in the same 17 patients 2 years before the cancers were found and three that revealed a coexisting benign nodule (diameter, 4-5 mm), were not used in this investigation. All 17 cancers were adenocarcinomas. At low-dose CT, six nodules had pure GGO, 10 nodules had mixed GGO, and one nodule had solid opacity. The mean diameter of the 17 missed cancers was 10 mm (range, 6-17 mm). Fifteen studies in which cancer was missed due to interpretation errors were also excluded from the observer study. In addition, we included studies obtained in 10 control subjects (five men and five women; mean age, 63 years; age range, 49-69 years) without cancer who had participated in the same screening program and whose ages and sexes closely matched those of the patient group; findings in these subjects were confirmed with 2-year follow-up. Some of the 27 studies revealed other abnormal findings such as scars, focal interstitial lung lesions, and small (<3 mm) benign nodules. The CAD scheme had a sensitivity of 82% (14 of 17 cancers), with 3.0 false-positive findings per study (range, zero to eight) for patients with missed cancers and 2.4 false-positive findings per study (range, zero to five) for the 10 control subjects (7,8).
Two image display formats were used in this investigation: a multiformat display and a "stacked" cine-mode display (Fig 1). For the multiformat display, from the top to the bottom of the entire lung for each patient, 27 consecutive sections with the original matrix size at low-dose CT were displayed in a multiformat display (3 x 3) on three high-spatial-resolution (1600 x 1200 pixels) liquid crystal display color monitors (CCL202; Totoku Electric, Tokyo, Japan). For cine-mode display, the same 27 CT sections for each study were magnified and stacked on one monitor. The speed or sequence of the image display for cine-mode display was controlled manually by the observer. The windowing in the two image display formats was initially set at lung settings but could be adjusted by the observer to bronchial or mediastinal settings. Two clinical parameters (age and sex) were provided to the observer on the monitor.
Radiologists were given the following instructions: "(a) We wish to evaluate radiologists' performance in detecting lung cancer without and with a CAD scheme on low-dose CT scans obtained from a screening program. (b) The role of the CAD output is that of a second opinion. (c) Twenty-seven low-dose CT studies (with 10-mm-thick sections) that did not or did contain lung cancer and/or noncancerous abnormalities such as benign nodules and scars are included in this observer study. (d) The observer in this study will be blinded to the number of patients with lung cancer and the performance level of the CAD scheme. (e) Click on the screen by using a mouse (i) to indicate on a bar your confidence level regarding the presence (or absence) of a lung cancer and (ii) to locate the most likely position on each CT scan. You may indicate the cancer location first and (f) click on one of the following four clinical actions: (i) Return to annual screening, (ii) diagnostic thin-section CT in 6 months, (iii) diagnostic thin-section CT in 3 months, or (iv) diagnostic thin-section CT immediately." The radiologists made their judgments first without and then with the CAD scheme.
For a training session before the test, we provided five different cases (that were not part of the study set of 27) so that radiologists could learn how to operate the cine-mode interface and how to take into account the computer output in their decision. The reading time was not limited in this study. The average reading time was 48 minutes (range, 27-61 minutes; 1.8 minutes per case).
Statistical Analysis
The confidence level ratings from each observer were analyzed with use of the receiver operating characteristic (ROC) method, and a quasi-maximum-likelihood estimation of the binormal distribution was fitted to the radiologists' confidence ratings (11). The statistical significance of the difference in the area under the ROC curve (Az) between observer readings without and with the CAD scheme was tested with use of the Dorfman-Berbaum-Metz method (12), which included both reader variation and case sample variation by means of an analysis of variance approach. Localization ROC (LROC) curves (13) for observers without and with the CAD scheme were also determined for each reading condition.
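To illustrate what the area under the ROC curve summarizes, the sketch below computes the empirical (nonparametric) area, which equals the probability that a randomly chosen cancer case receives a higher confidence rating than a randomly chosen control. This is a swapped-in simplification: the study fitted a binormal model and tested differences with the Dorfman-Berbaum-Metz method, neither of which is reproduced here. The function name and all ratings are hypothetical.

```python
def empirical_auc(ratings_cancer, ratings_control):
    """Fraction of (cancer, control) pairs ranked correctly; ties count as one half."""
    wins = 0.0
    for c in ratings_cancer:
        for n in ratings_control:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(ratings_cancer) * len(ratings_control))

# Hypothetical confidence ratings on a 0-1 scale for one reader (not study data).
cancer_without_cad = [0.2, 0.6, 0.4, 0.8]
cancer_with_cad = [0.5, 0.7, 0.6, 0.9]
control_ratings = [0.1, 0.5, 0.3]

auc_without = empirical_auc(cancer_without_cad, control_ratings)
auc_with = empirical_auc(cancer_with_cad, control_ratings)
```

A fitted binormal curve generally gives a somewhat different area than this empirical estimate, so the two should not be compared directly.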
The "proper" binormal model (14) was used to fit the ROC and LROC curves (Metz CE, written communication, 2004). In this study, localization was considered correct if the center of the cancer lesion was located within 15 mm from the point marked by the observer. The distance criterion of 15 mm was based on the fact that our database contained lesions with diameters as large as 26 mm. The distance was computed automatically by the user interface program. The sensitivity in this study was defined on the basis of the number of cancer lesions that were correctly located by an observer regardless of the confidence level ratings. The statistical significance of the difference in sensitivities between the computer outputs and the observer readings without and with the CAD scheme was tested by means of a confidence interval method by taking into account reader variation alone (15). The statistical significance of the difference in sensitivities between radiologists without and with the CAD scheme and in clinical actions between a beneficial and a detrimental effect of the CAD scheme for each of the studies that did or did not contain a lung cancer was estimated with use of the Student paired t test for the 14 radiologists. In general, P < .05 was considered to indicate a statistically significant difference.
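The localization criterion and the sensitivity definition above can be sketched as follows. The function names and the example coordinates are hypothetical, and the distance is computed in-plane only, since the text does not state how marks on a different section were handled. The observation counts are those reported in the abstract.

```python
import math

def localization_correct(marked_px, center_px, pixel_size_mm, max_distance_mm=15.0):
    """True if the observer's mark lies within max_distance_mm of the lesion center.

    Coordinates are (column, row) pixel indices on the same section (assumption).
    """
    dx = (marked_px[0] - center_px[0]) * pixel_size_mm
    dy = (marked_px[1] - center_px[1]) * pixel_size_mm
    return math.hypot(dx, dy) <= max_distance_mm

def sensitivity_percent(correct, total):
    """Correctly located cancers over all cancer observations, as a rounded percentage."""
    return round(100.0 * correct / total)

# Hypothetical marks, using one of the pixel sizes stated for the database (0.586 mm).
near_mark = localization_correct((270, 266), (256, 256), 0.586)
far_mark = localization_correct((290, 256), (256, 256), 0.586)

# Observation totals: 17 cancers times the number of readers in each group.
totals = {"all": 17 * 14, "group 1": 17 * 6, "group 2": 17 * 8}
without_cad = sensitivity_percent(124, totals["all"])
with_cad = sensitivity_percent(163, totals["all"])
```

Defining sensitivity on correctly located lesions, rather than on confidence ratings alone, keeps a high rating attached to the wrong location from counting as a detection.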
| RESULTS |
|---|
Clinical Actions
For the four clinical actions described earlier (ie, return to annual screening or perform diagnostic thin-section CT in 6 months, 3 months, or immediately), we attempted to quantify the changes in clinical action attributable to use of the CAD scheme. For patients with a lung cancer, the average number for whom clinical actions were changed for a beneficial effect (ie, a "step up") (3.3) was greater than the number for whom clinical actions were changed for a detrimental effect (ie, a "step down") (0.4) (P < .001). For patients without a lung cancer, the average numbers affected by the CAD scheme for a beneficial effect (step down) and a detrimental effect (step up) were 0.5 and 0.3, respectively (P = .27).
Table 2 shows the number of patients for whom the important clinical action related to follow-up was influenced positively or negatively for the 14 radiologists. For patients with a lung cancer, the difference between the mean number of patients in whom the action was changed from screening to follow-up (2.1 patients) and the mean number of patients in whom the action was changed from follow-up to screening (0.3 patients) was significant (P = .005). For patients without a lung cancer, no statistically significant difference between a beneficial effect (a change from follow-up to screening [in 0.2 patients]) and a detrimental effect (a change from screening to follow-up [in 0.2 patients]) owing to use of the CAD scheme was found for the radiologists (P > .99).
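The comparisons of beneficial and detrimental changes above rely on a paired t test across the 14 radiologists. A minimal sketch of the test statistic is given below; the per-reader counts are hypothetical placeholders, not the study's data, and the function name is an assumption. The P value would then be read from a t distribution with the returned degrees of freedom.

```python
import math
import statistics

def paired_t_statistic(first, second):
    """Paired t statistic and degrees of freedom for first minus second."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1

# Hypothetical number of cancer cases per reader with a beneficial ("step up")
# or detrimental ("step down") change in clinical action after viewing CAD output.
step_up = [3] * 14
step_down = [0] * 7 + [1] * 7

t_value, dof = paired_t_statistic(step_up, step_down)
```

Pairing by reader removes between-reader differences in baseline behavior, so only the within-reader change contributes to the statistic.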
| DISCUSSION |
|---|
There were some differences between the present study and the previous studies, as follows: In the present study, (a) the CAD scheme was developed by using missed lung cancers, which were confirmed at surgery; (b) the mean diameter of the cancers was 12 mm (all were at least 6 mm), and the CT findings for the cancers included lesions with pure GGO, mixed GGO, and solid opacity; and (c) ROC, LROC, and sensitivity analyses were used to evaluate radiologists' performance in the detection of subtle cancers without and with CAD. The importance of these differences is discussed in the next paragraphs.
Missed lung cancers include the most difficult cases for detection in clinical work and mass screening programs, and several investigators have reported the possible reasons for missing lung cancers on CT scans (4,5,22,23). In our series (5), lung cancers were missed mainly because they had low attenuation (eg, they were of small size and/or were faint lesions with GGO) or because of the presence of large structured noise elements (normal structures and/or complex backgrounds caused by other disease) or both. In addition, the cancers had poor conspicuity as defined by Kundel and Revesz (24). In general, the missed cancers corresponded to earlier visible findings in the same locations at previous examinations, findings that had been identified as abnormal according to radiologists' consensus. However, in a previous study by Austin et al (25) of radiologists' performance alone, each of six radiologists, who were biased by knowledge that the patients had lung cancers that were missed on chest radiographs, missed cancer in a mean of 26% of 22 patients. The main purpose of our study was to identify whether unassisted radiologists could identify these previously missed cancers in the context of an observer study and to evaluate whether a CAD scheme could help them detect the cancers missed on CT scans.
Diederich et al (26) reported that more than 70% of noncalcified nodules are 5 mm or smaller, and no lung cancers were found among those small lesions in CT screening programs for lung cancer at baseline. Similar findings have also been reported by Swensen et al (27). Henschke et al (28) reported that the frequency with which malignancy was or could have been diagnosed when the largest noncalcified nodule was smaller than 5 mm in diameter was very low (0 of 378). The nodules with pure or mixed GGO on CT scans in lung cancer screening programs were more likely to be malignant than were solid nodules (29,30). Although there was a limitation in the low-dose CT protocol with 10-mm-thick sections used in our study, the cancers were at least 6 mm in diameter, and the CT findings for the cancers included lesions with pure GGO, lesions with mixed GGO, and lesions with solid opacity. We believe, therefore, that it may be more important for a CAD scheme used as a "second opinion" to detect relatively large nodules with or without GGO; such nodules include primary lung cancers more frequently than small nodules, most of which are benign lesions, do.
Basically, ROC analysis without localization (11,12) can help correctly evaluate observer performance in the detection of the presence (or absence) of a lesion on medical images when each image does not include obvious false-positive findings, provided that the number of patients is sufficiently large. However, because chest CT scans may contain pulmonary vessels or focal lung diseases that have an appearance that is similar to that of nodular lesions, high positive confidence level ratings by radiologists for a given CT study do not always correspond to true-positive findings (lung cancers) but instead sometimes correspond to false-positive findings. With use of LROC analysis (13), only the responses with correct localization are evaluated for each reading condition, although a proper statistical test for practical use in evaluating the difference between the curves is still unavailable. The shortcoming of LROC analysis for estimating sensitivity is that the radiologist's performance is evaluated only for patients with true-positive findings and not for patients with true-negative findings. Therefore, in this study, we decided to evaluate the performance with three methods (ROC, LROC, and sensitivity analysis), and the results obtained with all three methods showed that the diagnostic accuracy of the radiologists improved with use of the CAD scheme.
Although the radiologists in our study were able to recognize the presence of some subtle lung cancers, they could not be sure whether the CT features of the lesion were indicative of malignancy even when the computer marked the lesion. The possible reasons why the sensitivity for radiologists who used CAD did not reach at least 82% include the fact that the radiologists were not familiar with the appearance of early lung cancers at CT, especially at thick-section CT. In addition, the sensitivity of the radiologists for detecting cancer lesions was affected by some findings such as scars and vertically oriented pulmonary vessels, which had an appearance similar to that of nodular lesions on CT scans in this observer study. In addition, false-positive computer findings would have an effect on radiologists' performance in the detection of lung cancer. We noted that radiologists tended to ignore the CAD output more frequently for studies with a large number of false-positive findings (eight per study, the largest in our scheme) than for those with a small number of false-positive findings. In a previous observer study of the use of LROC analysis in the detection of clustered microcalcifications on mammograms, Chan et al (31) reported that radiologists' diagnostic accuracy with CAD was further improved by reducing the computer's false-positive rate (from four to one false-positive finding per image).
In this observer study, the use of CAD had a detrimental effect in two patients for two radiologists. In one patient, a radiologist detected a cancer lesion without CAD with a confidence level of 0.46 and made a recommendation to follow up the cancer with diagnostic thin-section CT in 3 months. The computer indicated the cancer lesion and eight false-positive findings. With use of CAD in the same patient, a different radiologist changed the location from cancer to a false-positive finding (vertical pulmonary vessel) with a confidence level of 0.59 and did not change the clinical action. In another patient, another radiologist detected a cancer lesion without CAD with a confidence level of 0.31 and recommended follow-up with CT in 6 months. The computer did not mark the cancer lesion but indicated three false-positive findings. The radiologist who used CAD also changed the location from cancer to a false-positive finding (vertical pulmonary vessel) with a confidence level of 0.46 and did not change the clinical action. Therefore, when the CAD scheme yields false-positive findings that are very similar to true-positive findings, it may have a detrimental effect on the observers' performance when the task involves the detection of only one lesion at CT in an observer study. If radiologists were allowed to identify more than one lesion in an observer study, however, it is possible that they might elect to keep the cancer as detected initially and add the false-positive finding as a further suspicious area.
In recent CT screening programs, most images were reviewed in a multiformat display (film- or monitor-based viewing) and/or a cine-mode display (13,26,27). The cancers in this observer study were missed in the Nagano lung cancer screening project, in which a multiformat display (3 x 4 or 4 x 4) on two high-spatial-resolution (1728 x 2304 matrix) monitors was used (5). A similar multiformat viewing mode was used in our study by the radiologists in group 1. In general, cine viewing of CT scans of the chest is believed to improve radiologists' ability to detect lung nodules compared with film-based viewing (32,33). Tillich et al (33), however, found no significant difference between cine and film-based viewing in the detection rate of pulmonary nodules (metastases) larger than 5 mm in diameter. We also did not find a significant difference between the two viewing modes in the detection of primary lung cancers (≥6 mm) missed in a CT screening program. The limitations of this study include the facts that the low-dose CT sections were thick (10 mm), rather than thin, and the radiologists differed in the two groups. It was not the purpose of our study to compare diagnostic accuracy with the cine or multiformat mode but rather to determine that the benefits of CAD were substantial, independent of the display mode used.
In summary, lung cancers missed at low-dose CT screening were very difficult to detect, even in an observer study; the use of CAD, however, improved the radiologists' performance in the detection of these subtle cancers. In addition, CAD can help radiologists make recommendations for follow-up.
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Abbreviations: Az = area under ROC curve; CAD = computer-aided detection; GGO = ground-glass opacity; LROC = localization ROC; ROC = receiver operating characteristic
See Materials and Methods for pertinent disclosures.
Author contributions: Guarantors of integrity of entire study, F.L., K.D.; study concepts/study design or data acquisition or data analysis/interpretation, all authors; manuscript drafting or manuscript revision for important intellectual content, all authors; approval of final version of submitted manuscript, all authors; literature research, F.L., K.D.; clinical studies, F.L., H.A., S.S.; statistical analysis, F.L., K.S., J.S., Q.L., K.D.; and manuscript editing, F.L., H.A., K.S., Q.L., R.E., H.M., K.D.