Radiology
(Radiology. 2001;218:873-880.)
© RSNA, 2001


Breast Imaging

Comparison of Full-Field Digital Mammography with Screen-Film Mammography for Cancer Detection: Results of 4,945 Paired Examinations¹

John M. Lewin, MD, R. Edward Hendrick, PhD², Carl J. D’Orsi, MD, Pamela K. Isaacs, DO, Lawrence J. Moss, MD, Andrew Karellas, PhD, Gale A. Sisney, MD³, Christopher C. Kuni, MD and Gary R. Cutter, PhD

¹ From the Dept of Radiology, Univ of Colorado Health Sciences Ctr, CB E-030, 4200 E Ninth Ave, Denver, CO 80262 (J.M.L., R.E.H., P.K.I., G.A.S., C.C.K.); the Dept of Radiology, Univ of Massachusetts Medical Center, Worcester (C.J.D., L.J.M., A.K.); and AMC Cancer Research Ctr, Denver, Colo (G.R.C.). From the 1998 RSNA scientific assembly. Received Jan 4, 2000; revision requested Jan 25; revision received Jul 17; accepted Aug 2. R.E.H. supported by grant DAMD17-96-C-6104 from the U.S. Army Breast Cancer Research and Materiel Command. Address correspondence to J.M.L. (e-mail: john.lewin@uchsc.edu).


    ABSTRACT
PURPOSE: To prospectively compare full-field digital mammography (FFDM) with screen-film mammography (SFM) for cancer detection in a screening population.

MATERIALS AND METHODS: At two institutions, 4,945 FFDM examinations were performed in women aged 40 years and older presenting for SFM. Two views of each breast were acquired with each modality. SFM and FFDM images were interpreted independently. Findings detected with either SFM or FFDM were evaluated with additional imaging and, if warranted, biopsy.

RESULTS: Patients in the study underwent 152 biopsies, which resulted in the diagnosis of 35 breast cancers. Twenty-two cancers were detected with SFM and 21 with FFDM. Four were interval cancers that became palpable within 1 year of screening and were considered false-negative findings with both modalities. The difference in cancer detection rate was not significant. FFDM had a significantly lower recall rate (11.5%; 568 of 4,945) than SFM (13.8%; 685 of 4,945) (P < .001, McNemar χ² model; P < .03, generalized estimating equations model). The positive biopsy rate for findings detected with FFDM (30%; 21 of 69) was higher than that for findings detected with SFM (19%; 22 of 114), but this difference was not significant.

CONCLUSION: No difference in cancer detection rate has yet been observed between FFDM and SFM. FFDM has so far led to fewer recalls than SFM.

Index terms: Breast neoplasms, radiography, 00.321, 00.324, 00.327 • Breast radiography, comparative studies, 00.321, 00.324, 00.327 • Breast radiography, technology, 00.119 • Cancer screening, 00.321, 00.324, 00.327 • Radiography, digital, 00.321, 00.324, 00.327


    INTRODUCTION
In recent years, advances in screen-film technology and film-processing techniques have contributed to major improvements in the quality of mammographic images. Although screen-film mammography (SFM) provides a powerful tool for detection and follow-up of suspicious lesions, it has important limitations in detecting subtle soft-tissue lesions, especially in the presence of dense glandular tissues. One of the limitations of SFM is that the film serves simultaneously as the image receptor, display medium, and long-term storage medium. This limitation can lead to loss of image contrast, especially when exposure or film-processing conditions lead to lower optical densities in lesion-containing tissues (1,2).

Consequently, attempts have been made to develop digital image receptors as a substitute for the screen-film image receptors now used in mammography (3–5). Much of the current clinical experience with digital mammography has been derived from detectors with a small field of view that have been used for stereotactic core biopsies. These digital detectors have spatial resolutions of 5–10 line pairs per millimeter, limited primarily by the pitch of the charge-coupled device and by the use of minimizing optics to allow a 2.5 x 2.5-cm charge-coupled device to cover an adequate field of view for stereotactic imaging (typically 5 x 5 cm). The issues that relate to the performance of digital mammography detectors have been addressed in a number of studies (6–10).

Recently, prototypic whole-breast digital imaging systems have been introduced for clinical testing to compare full-field digital mammography (FFDM) to SFM. The varied designs of these digital imaging systems have been described elsewhere (3,4). The majority of clinical testing with these systems has been performed as part of studies conducted to help the manufacturers of FFDM equipment gain approval from the Food and Drug Administration for their devices (9), and the results of the studies have not been published. These studies use a cohort undergoing diagnostic mammography to provide a sufficient number of cancers for testing. Unfortunately, studies based on such populations are likely to suffer from entry bias if entrance into the cohort is predicated on an abnormal screening mammogram. The present study eliminates that form of entry bias by enrolling a screening cohort. To our knowledge, this study is the first to compare FFDM with SFM in a screening cohort. The goal of our study is to test whether FFDM is superior to, inferior to, or equivalent to SFM for screening for breast cancer. The study protocol is designed to minimize bias from preferential verification of findings detected with SFM ("verification bias") by recommending imaging work-up or biopsy on the basis of positive screening results with either FFDM or SFM.


    MATERIALS AND METHODS
All women presenting for screening mammography at either of two university medical centers were eligible for enrollment if they were at least 40 years old, did not have breast implants, and had breasts that would each fit entirely on a large (24 x 30-cm) screen-film image receptor. Each woman signed an informed consent form approved by the institutional review board of the site and the Institutional Review Board of the U.S. Army Medical Research and Materiel Command.

Image Acquisition
In 4,523 cases (91%), the subject underwent FFDM immediately after SFM. In these cases, the mammograms were acquired by the same technologist. In 422 cases (9%), the subject came to an outlying screening center for SFM and underwent FFDM during a separate visit to the radiology department within 3 days after SFM. In these cases, the full-field digital mammogram and the screen-film mammogram were acquired by different technologists. Technique factors were recorded at the time of the SFM examination.

SFM was performed with a commercially available unit (General Electric DMR unit; GE Medical Systems, Milwaukee, Wis) by using a commercial screen-film system (Kodak Min-R 2000; Eastman Kodak, Rochester, NY). Technique factors (peak kilovoltage, radiation dose, target, and filter) for SFM were automatically selected by the unit (Automated Optimization Parameters feature). The contrast mode was used when the compressed breast thickness was less than 5 cm, and the dose mode was used when the compressed breast thickness was greater than 5 cm. Quality control procedures for SFM were in accordance with the guidelines of the Mammography Quality Standards Act.

FFDM was performed with a prototypic system that used an amorphous silicon area detector bonded to a cesium iodide crystal (GE Medical Systems). The pitch of the detector elements was 100 µm, yielding a limiting spatial resolution of 5 line pairs per millimeter. The active area of the detector was 18 x 23 cm, which yielded an image size of 1,800 x 2,304 pixels. The x-ray tube, support, and generator were identical to those in the commercial SFM system. Each digital detector system underwent extensive acceptance testing (7,8). Ongoing quality control for the FFDM systems included daily phantom imaging and weekly flat-field calibration of the detectors at each site.
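
As a rough check on the detector geometry described above, the following sketch (not part of the original study) derives the sampling-limited resolution and the nominal pixel matrix from the stated 100-µm pitch and 18 x 23-cm active area. The variable names are illustrative. The nominal division gives 2,300 rows, so the reported 2,304 presumably reflects the physical readout matrix rather than an exact quotient; that interpretation is an assumption.

```python
# Values taken from the text; names are illustrative only.
PITCH_UM = 100                # detector element pitch, micrometers
ACTIVE_AREA_MM = (180, 230)   # 18 x 23 cm active area, millimeters

# One line pair spans two detector elements, so the sampling (Nyquist)
# limit is 1 / (2 * pitch).
pitch_mm = PITCH_UM / 1000
limiting_resolution_lp_per_mm = 1 / (2 * pitch_mm)

# Nominal pixel count along each side: side length divided by pitch.
nominal_pixels = tuple(side_mm * 1000 // PITCH_UM for side_mm in ACTIVE_AREA_MM)

print(limiting_resolution_lp_per_mm)
print(nominal_pixels)
```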

Technique factors and breast doses for FFDM were matched to those of SFM by using the same target, filter, and peak kilovoltage as used for SFM. Values for radiation dose were matched as closely as possible between the two techniques, but only discrete mAs stations were available with FFDM. When the mAs value used for SFM was between two allowed choices on the FFDM unit, the lower mAs step was used for FFDM. Both FFDM and SFM units allowed the selection of rhodium/rhodium and molybdenum/rhodium target/filter choices in addition to molybdenum/molybdenum. Identical reciprocating grids were used for both SFM and FFDM. For the study, the mean peak kilovoltage was 26.90 kVp with each modality. The mean mAs value was 128.6 mAs for FFDM and 128.9 mAs for SFM.
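
The rule for matching mAs between the two units can be sketched as follows. This is a minimal illustration, not the study's software: the list of allowed stations is hypothetical (the article does not list them), and raising an error when the SFM value falls below the lowest station is an assumption about a case the article does not describe.

```python
from bisect import bisect_right

def ffdm_mas_station(sfm_mas, stations):
    """Return the largest allowed FFDM mAs station not exceeding sfm_mas.

    When the SFM value falls between two stations, this picks the lower
    one, as described in the text.
    """
    ordered = sorted(stations)
    index = bisect_right(ordered, sfm_mas)
    if index == 0:
        # Behavior below the lowest station is not described in the article.
        raise ValueError("SFM mAs is below the lowest available station")
    return ordered[index - 1]

# Hypothetical station list, for illustration only.
example_stations = [100, 125, 160, 200]
print(ffdm_mas_station(130, example_stations))  # falls between 125 and 160
print(ffdm_mas_station(125, example_stations))  # exact match
```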

Rather than trying to match compression force and compressed breast thickness, the technologist was instructed to obtain the best possible positioning and compression with each modality. Compression force and compressed breast thickness were recorded for each view.

Each examination consisted of the two standard screening views of each breast: craniocaudal and mediolateral oblique views. Both 18 x 24-cm and 24 x 30-cm screen-film image receptors were used for SFM. The image receptor size of the FFDM prototypic unit was 18 x 23 cm. If a subject’s breast was too large to fit on the 18 x 24-cm screen-film image receptor, she was advised after SFM that additional exposures would be needed for her FFDM to obtain coverage of each breast in each of the two standard projections. If she chose to continue in the study, her breast was imaged with FFDM by overlapping as many 18 x 23-cm views as needed to cover each breast. This typically resulted in one or two extra exposures per breast.

Image Interpretation
Screen-film images were interpreted with the routine clinical images of that day in a darkened room on a standard mammography alternator with a luminance of at least 3,000 candelas per square meter. The reader knew at the time of interpretation that the examination was part of the study.

Digital images were interpreted in soft copy on a prototypic Unix-based workstation consisting of a computer (UltraSparc; Sun Microsystems, Palo Alto, Calif) with dual 21-inch high-luminance monitors, each capable of displaying 1,800 x 2,300 pixels (Megascan, North Billerica, Mass). These were driven by 4-megabyte video cards with 8-bit output (Dome, Waltham, Mass). The images were interpreted without postprocessing other than an initial window and level setting derived from the image histogram. A gamma function was used to map the 14-bit image data to the 8-bit display data.
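
To illustrate what mapping 14-bit data to an 8-bit display through a gamma function can look like, here is a minimal sketch. The article gives neither the functional form nor the exponent, so the power-law form and the default exponent of 0.5 are assumptions made purely for illustration; the window and level step is not modeled.

```python
MAX_14_BIT = 2 ** 14 - 1   # 16383
MAX_8_BIT = 2 ** 8 - 1     # 255

def gamma_map_14_to_8(value, gamma=0.5):
    """Map a 14-bit pixel value to an 8-bit display value.

    Assumes a simple power-law curve; the exponent is hypothetical.
    """
    if not 0 <= value <= MAX_14_BIT:
        raise ValueError("value must be in the 14-bit range 0..16383")
    normalized = value / MAX_14_BIT
    return round(MAX_8_BIT * normalized ** gamma)

# The endpoints map to the ends of the display range regardless of gamma.
print(gamma_map_14_to_8(0), gamma_map_14_to_8(MAX_14_BIT))
```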

The radiologist had full freedom to adjust window and level and to magnify each image interactively. Magnification of each view to x2 power was typically performed for each reading. For the first approximately 200–400 cases, magnification was accomplished with a moving square showing a portion of the image magnified ("mag glass"). Subsequently, each quadrant of each image was magnified in its entirety and examined.

The digital workstation was located in a darkened room away from mammographic alternators. Comparison mammograms were viewed on a standard light box placed next to the digital workstation. The light box was turned off during the detailed evaluation of digital images to avoid glare.

The screen-film mammogram and the digital mammogram were interpreted at the institution at which they were acquired. Interpretation was done independently by board-certified radiologists qualified under the Mammography Quality Standards Act who also interpreted clinical mammograms at that institution. For a given subject, one radiologist interpreted the SFM and another the FFDM. Each reader was blinded to the results of the other and to the images from the other modality. Comparison images, prior reports, and the subject’s history were available to each radiologist for both the SFM and FFDM interpretations. At one institution, residents participated in some of the interpretations in the study but never participated in both the SFM and FFDM readings of a given subject; at the other institution, residents did not participate. Resident participation was not recorded but was more likely to occur with the SFM reading than the FFDM reading. The attending radiologist, of course, gave the final interpretation. A given attending radiologist was required to interpret approximately equal numbers of SFM and FFDM examinations. Both the SFM and the FFDM interpretations were used for clinical management of the patient.

For each finding, the reader was required to give the following information:

  1. A description of the type and location of the finding by using the nomenclature of the Breast Imaging Reporting and Data System (BI-RADS) (11).
  2. A BI-RADS assessment of 0 (need additional imaging evaluation), 2 (benign), 3 (probably benign), 4 (suspicious abnormality), or 5 (highly suggestive of malignancy). The BI-RADS assessment category 1 (negative) was not used to describe a finding because it is reserved for cases with no findings.
  3. A BI-RADS recommendation.
  4. An integer percentage probability from 0% to 100% that the finding represented cancer. This value is used for free-response receiver operating characteristic (ROC) analysis.

Additional Imaging Evaluation of Findings
Additional imaging evaluation of a given finding was performed to establish truth for that finding in a manner that minimized bias toward either modality. Both additional mammographic views and ultrasonographic images could be obtained as indicated. Concordant findings (ie, those detected with both modalities) needing additional mammographic views were imaged with SFM. Discordant findings (ie, those detected with one modality only) needing additional mammographic views were imaged with the modality with which they were detected, with the exception that magnification views were obtained by using SFM only. This exception was because of the inability to easily remove the grid on the FFDM prototype. Thus all calcifications were worked up with SFM.

Review of Discordant Cases
For all findings recalled for additional evaluation on only one of the two modalities, the two radiologists would evaluate the FFDM and SFM images side by side. At this time, the radiologists had the option of dismissing a finding (ie, not working it up) if, after viewing both images, they could determine a benign cause for the finding or believed that there was no reasonable chance that it represented cancer. Dismissed findings were still counted as positive in calculations of recall rate, sensitivity, and other performance measures.

Data Collection
Data were collected on paper forms and then entered into a customized database program (Microsoft Access 2.0; Microsoft, Redmond, Wash). Each subject filled out a history form that detailed their risk factors and demographic data. The technologist reviewed the forms with the subject. The technologists filled out forms that specified the reason for the examination and the technique factors. The radiologists filled out the interpretation forms. Computer data entry was performed by a professional research assistant at each site.

Long-term Surveillance
Each woman in the study is followed for 1 year following her participation in the study to assess for possible development of cancers that were not detected with either modality or were detected at screening but not subjected to biopsy. Surveillance comes for most subjects when they return to the participating institution for subsequent screening mammography. Those who do not return are contacted by telephone or mail to determine if they have undergone subsequent mammography and whether a diagnosis of breast cancer has been made either through mammographic or clinical detection. In addition, for subjects from one site, cases of breast cancer recorded in the state tumor registry will be periodically cross-matched against subjects in the study. Permission to obtain this information was included in the informed consent.

Data Analysis
For calculating sensitivity, a finding was considered to have been called with a given modality if it was assigned a recommendation for immediate work-up, such as additional imaging, obtaining prior images for comparison, or biopsy. These recommendations generally corresponded to a BI-RADS assessment of 0, 4, or 5. A finding detected but given a recommendation of short-interval (6-month) or routine follow-up was considered negative. These recommendations generally corresponded to a BI-RADS assessment of 2 or 3. Each finding was assessed separately to determine truth by imaging work-up and, if warranted, by biopsy. The final assessment of truth for a finding determined benign with additional imaging was modified if a biopsy performed on the finding within 1 year demonstrated cancer.
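
The rule above for deciding whether a finding counts as called can be written as a small lookup. This is a sketch only: the recommendation labels are hypothetical stand-ins for whatever wording appeared on the interpretation forms, which the article does not reproduce.

```python
# Hypothetical labels for the recommendation recorded by the reader.
IMMEDIATE_WORKUP = {"additional imaging", "obtain prior images", "biopsy"}
FOLLOW_UP = {"short-interval follow-up", "routine follow-up"}

def called_positive(recommendation):
    """Return True if a finding counts as called for sensitivity purposes.

    The decision is driven by the recommendation, not by the BI-RADS
    assessment, which only "generally corresponded" to it.
    """
    if recommendation in IMMEDIATE_WORKUP:
        return True
    if recommendation in FOLLOW_UP:
        return False
    raise ValueError(f"unknown recommendation: {recommendation!r}")

print(called_positive("biopsy"))
print(called_positive("short-interval follow-up"))
```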

For purposes of establishing recall rate, an examination was considered positive for recall (a) if immediate work-up was recommended or (b) if comparison studies were recommended and those comparison studies led to additional work-up. For calculating the number of biopsies and the positive biopsy rate, fine-needle aspiration of solid masses, core-needle biopsy, and surgical biopsy were all included. The pathologic classifications of benign, high-risk, and malignant follow the second edition of BI-RADS (11). In this classification, atypical ductal hyperplasia and lobular carcinoma in situ are considered high-risk lesions.

ROC curves were constructed by using the alternative free-response ROC method (12). To use this approach, the value of the highest rated benign finding in each breast was assigned as the false-positive level for that breast. If no finding was called in a given breast, that breast was assigned a false-positive level of 0. Each cancer was assigned the rating given to it; if it was not detected, it was assigned a rating of 0. With alternative free-response ROC analysis, the area under the ROC curve, A1, is analogous to the area under a standard ROC curve, Az, allowing the use of standard ROC analysis techniques (12–15). The 101-point scale was used without binning. Curve areas were integrated by using the trapezoidal rule.
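
The construction described above can be sketched in a few lines. This is an illustrative reading of the method, not the authors' code: the function name is hypothetical, and the example ratings are invented toy values, not study data. Each threshold on the 0 to 100 scale yields one operating point, and the area is integrated with the trapezoidal rule.

```python
def afroc_area(cancer_ratings, breast_fp_levels):
    """Trapezoidal area under an alternative free-response ROC curve.

    cancer_ratings: one 0-100 rating per cancer (0 = not detected).
    breast_fp_levels: one value per breast, the highest rating given to a
        benign finding in that breast (0 = no finding called).
    """
    points = [(0.0, 0.0)]
    # Sweep the threshold from strictest to most lenient, without binning.
    for threshold in range(100, -1, -1):
        tpf = sum(r >= threshold for r in cancer_ratings) / len(cancer_ratings)
        fpf = sum(r >= threshold for r in breast_fp_levels) / len(breast_fp_levels)
        points.append((fpf, tpf))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Toy data: two cancers (one missed) and four breasts (one false positive).
print(afroc_area([90, 0], [0, 0, 0, 50]))
```

Because undetected cancers and breasts without findings both receive a rating of 0, the final operating point is always (1, 1), which closes the curve.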

Tests for the significance of the sensitivity, recall rate, positive biopsy rate, and positive predictive value were performed by using the McNemar χ² test and, if found significant with that test, by using a generalized estimating equations model implemented with the PROC GENMOD command in SAS software (SAS Institute, Cary, NC). The latter is a stricter test of significance because it takes into account the variation among readers and adds this random effect to the variance used in testing.


    RESULTS
Results are given for subjects enrolled at one site as of May 21, 1999, or at the other site as of March 22, 1999. After these dates, new workstations with higher screen resolution and automatic postprocessing were installed at each site, which changed the reading environment for FFDM. In the cohort of subjects whose findings are being reported, 4,945 examinations were conducted in 3,890 women. For the 1,055 women in the study who enrolled twice, a minimum of 11 months separated the two examinations, with a mean separation of 12.8 months. Of the 4,945 examinations, 2,882 were conducted at one site and 2,063 at the other. Results are given for the combined population except where noted.

Table 1 summarizes demographic data and risk factors for the women enrolled in the study. The numbers approximate those of the screening population at each institution, with the exception of a relatively smaller number of subjects undergoing their first screening mammography who were enrolled in the study. The biggest difference between the populations at the two institutions was that almost twice as many women at site 1 were receiving hormone replacement therapy.


TABLE 1. Demographic and Risk Factors for the Study Cohort by Site and Combined Data
 
Table 2 compares the mean compression force and compressed breast thickness for images obtained with each modality. The differences are within the measurement error of the machines.


TABLE 2. Compression Force and Compressed Breast Thickness by Modality
 
Table 3 summarizes the distribution of the radiologist’s subjective rating of breast composition or mammographic density by using the BI-RADS nomenclature for each modality. Agreement is close between modalities for this parameter.


TABLE 3. Radiologist Rating of Breast Composition by Modality
 
In the study cohort, there were 1,448 findings for which immediate evaluation was recommended by at least one of the two readers. Of these findings, 507 were called positive only with FFDM, 746 were called positive only with SFM, and 195 were called positive with both modalities.

Table 4 gives the outcome for findings grouped by the modality with which they were called. Sixty-seven (13%) of the 507 FFDM-only findings and 14 (2%) of the 746 SFM-only findings were dismissed at the discrepancy conference. All of the subjects with dismissed findings have been followed for at least 1 year. Three of the FFDM-only findings that had been dismissed became palpable within 1 year. Two were invasive lobular carcinomas. The third was a fibroadenoma. No other dismissed finding has undergone suspicious change or warranted biopsy for any other reason. Most were not present at the follow-up examination. Four other patients refused additional imaging and have unresolved findings. Eighty-three (11%) of the 746 SFM-only findings and 38 (7%) of the 507 FFDM-only findings were biopsied; 31 (16%) of the 195 findings called with both modalities were biopsied.


TABLE 4. Outcome of Findings by Modality
 
Table 5 gives the distribution of finding type grouped by the modality with which the finding was detected. Most (133 of 195; 68%) of the findings that were called with both modalities (concordant findings) were masses. Still, consideration of the row totals (Table 5) shows that 487 (78%) of the 620 mass findings were discordant, that is, called only with one modality. Calcifications were also likely to be called with only one modality, usually SFM. Of the 296 calcification findings in the study, 181 (61%) were called only with SFM, 83 (28%) were called only with FFDM, and only 32 (11%) were called with both modalities. The greater proportions of discordant masses, calcifications, and total findings called with SFM are all statistically significant (P < .001).


TABLE 5. Distribution of Finding Type by Modality
 
Recall Rates
Recall rates were determined on the basis of positive examinations, rather than positive findings, consistent with the standard practice for the mammography audit (16). A total of 568 examinations were positive with FFDM, for a recall rate of 11.5% (568 of 4,945). A total of 685 examinations were positive with SFM, for a recall rate of 13.8% (685 of 4,945). Two hundred sixteen of these examinations were called positive with both modalities, giving a total of 1,037 examinations called positive, for a 21.0% recall rate for study participants. The recall rates at site 1 were 8.6% (248 of 2,882) for FFDM and 11.9% (344 of 2,882) for SFM. The recall rates at site 2 were 15.5% (320 of 2,063) for FFDM and 16.5% (341 of 2,063) for SFM. Findings dismissed at the discrepancy conference are counted as recalls in these calculations.
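
The recall arithmetic above can be reproduced directly from the reported counts. The sketch below uses only numbers stated in the text; the helper name and dictionary layout are illustrative.

```python
TOTAL_EXAMS = 4945
ffdm_positive = 568
sfm_positive = 685
both_positive = 216

# Examinations positive with at least one modality (inclusion-exclusion).
either_positive = ffdm_positive + sfm_positive - both_positive

def rate_percent(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)

# Per-site counts: (positive with FFDM, positive with SFM, examinations).
sites = {"site 1": (248, 344, 2882), "site 2": (320, 341, 2063)}

print(either_positive, rate_percent(either_positive, TOTAL_EXAMS))
print(rate_percent(ffdm_positive, TOTAL_EXAMS))
for name, (ffdm, sfm, exams) in sites.items():
    print(name, rate_percent(ffdm, exams), rate_percent(sfm, exams))
```

The site counts sum to the combined totals for each modality. Note that 685 of 4,945 evaluates to 13.85%, which the article reports as 13.8%.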

Findings Subjected to Biopsy and Cancers Detected
One hundred fifty-two findings were subjected to biopsy in the study, including 23 found to be benign with fine-needle aspiration. Thirty-five cancers were diagnosed. Of these, 26 were invasive, and nine were ductal carcinoma in situ. Nine of the cancers were detected with FFDM only, 10 with SFM only, and 12 with both modalities. None of these subjects were clinically suspected of having cancer at the time of mammography. Four interval cancers were not detected with either modality and were detected clinically within the next 11 months. Two of these were invasive lobular carcinoma, and two were invasive ductal carcinoma. Table 6 gives the outcome of the biopsies grouped by the modality with which the finding was detected at screening.


TABLE 6. Results of Biopsy
 
Table 7 gives the number of mammographically detected cancers grouped by mammographic lesion type and by the modality or modalities with which the cancer was detected. No trend is demonstrated except for architectural distortion. Five of the eight cancers detected as architectural distortion were called only with SFM versus two called only with FFDM. This difference is not significant (P > .2).


TABLE 7. Mammographically Detected Cancers by Mammographic Lesion Type
 
Long-term Surveillance
To date, we have 1-year follow-up on 2,929 of the 4,945 examinations in the study. In 2,861 cases, the surveillance came when the subject returned to the institution for another mammogram, either as part of the continuation of this study or outside of the study. Six patients presented with findings from clinical examination. Sixty-three subjects have been contacted by telephone.

Sensitivity, Positive Predictive Value, and ROC Analysis
The sensitivity of FFDM in the detection of screening cancers was 60% (21 of 35). The sensitivity of SFM was 63% (22 of 35). Because not all subjects have been followed for 1 year to obtain complete ascertainment of interval cancers, this represents an upper bound of the sensitivity for each modality. The relative sensitivity of FFDM to SFM was 95% (21 of 22).

The positive predictive value of screening is defined as the fraction of recalled examinations that led to a diagnosis of breast cancer. The positive predictive value of screening with FFDM was 3.7% (21 of 568). The positive predictive value of screening with SFM was 3.2% (22 of 685). The positive biopsy rate is defined as the fraction of biopsies that yielded cancer. The positive biopsy rate for all findings detected with FFDM was 30% (21 of 69); the positive biopsy rate for all findings detected with SFM was 19% (22 of 114).
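
The performance measures defined above follow from a handful of counts given in the text. The sketch below recomputes them; the variable and function names are illustrative, and all inputs are the article's reported numbers.

```python
# Cancers by detecting modality, plus interval cancers missed by both.
ffdm_only, sfm_only, both, interval = 9, 10, 12, 4
total_cancers = ffdm_only + sfm_only + both + interval

ffdm_cancers = ffdm_only + both
sfm_cancers = sfm_only + both

def percent(numerator, denominator, digits=0):
    """Percentage rounded to the given number of decimal places."""
    return round(100 * numerator / denominator, digits)

sensitivity = {"FFDM": percent(ffdm_cancers, total_cancers),
               "SFM": percent(sfm_cancers, total_cancers)}
relative_sensitivity = percent(ffdm_cancers, sfm_cancers)

# Positive predictive value of screening: cancers per recalled examination.
ppv = {"FFDM": percent(ffdm_cancers, 568, 1), "SFM": percent(sfm_cancers, 685, 1)}

# Positive biopsy rate: cancers per biopsy (69 FFDM, 114 SFM findings).
positive_biopsy_rate = {"FFDM": percent(ffdm_cancers, 69),
                        "SFM": percent(sfm_cancers, 114)}

print(sensitivity, relative_sensitivity, ppv, positive_biopsy_rate)
```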

ROC curves for the two modalities are presented in the Figure. Alternative free-response ROC analysis was performed by considering each breast separately, for a total of 9,716 data points (194 subjects had previously undergone mastectomy). The area under each curve is 0.76.



Figure 1. Alternative free-response ROC curves for SFM and FFDM plotted by using a rating scale from 0-100. The x-axis scale is the probability of a false-positive finding being called with the two screening views of a given breast. This is analogous to the false-positive rate in a standard ROC experiment. There is no difference in the areas under the curves. The area under each curve is 0.76.

 
Statistical Analysis
The lower recall rate for FFDM compared with SFM was significant when analyzed with both the McNemar χ² model (P < .001) and the generalized estimating equations model (P < .03). The difference in sensitivity between the two modalities lacked significance (P > .5, McNemar χ²). The higher positive predictive value of screening for FFDM compared with SFM was found to lack significance (P > .3, McNemar χ²), as did the higher positive biopsy rate for findings detected with FFDM as compared with those detected with SFM (P > .08, McNemar χ²).
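
The McNemar results for recall rate and sensitivity can be checked approximately from the discordant counts reported earlier. The sketch below is not the authors' analysis: it assumes the uncorrected form of the statistic (the article does not say whether a continuity correction was applied) and does not attempt the generalized estimating equations model. For 1 degree of freedom, the upper-tail χ² probability equals erfc(√(χ²/2)).

```python
import math

def mcnemar_chi2(only_a, only_b):
    """Uncorrected McNemar statistic and its 1-df P value.

    only_a, only_b: counts of pairs positive with one modality only.
    """
    chi2 = (only_a - only_b) ** 2 / (only_a + only_b)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Recall: examinations positive with FFDM only vs SFM only.
recall_chi2, recall_p = mcnemar_chi2(568 - 216, 685 - 216)

# Sensitivity: cancers detected with FFDM only vs SFM only.
sens_chi2, sens_p = mcnemar_chi2(9, 10)

print(round(recall_chi2, 2), recall_p < 0.001)
print(round(sens_chi2, 3), sens_p > 0.5)
```

Only discordant pairs enter the statistic, which is why the 821 discordant examinations (352 FFDM only plus 469 SFM only) carry the recall comparison.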


    DISCUSSION
The introduction of FFDM has been eagerly awaited as a way to increase the number of cancers that can be detected with mammography. Because many cancers that cannot be detected with mammography are in dense tissue, the presumption has been that the greater contrast resolution of FFDM would enable the demonstration of at least some of these cancers. This presumption of the superiority of FFDM has been tempered by the knowledge that SFM has an advantage in high-contrast spatial resolution and may have other advantages not so easily defined, including familiarity to the radiologist.

To determine how the advantages and disadvantages of the two modalities contribute to their performance in cancer detection, we are conducting a clinical study. Because it is possible that FFDM may be superior to, equal to, or inferior to SFM, we have designed our study to be able to test for all of these possibilities. To do so necessitates that we use what is essentially a screening population because basing enrollment on a positive clinical (screen-film) mammogram would bias the results toward SFM in terms of sensitivity and toward FFDM in terms of specificity (11). For example, had we selected only the subjects who had positive SFM examinations to receive FFDM, we would have detected only 22 cancers and concluded that SFM was more sensitive than FFDM because it detected all of them, whereas FFDM had detected only 12.

Unfortunately, the use of a screening population results in a relatively low number of cancers, typically 2–10 per 1,000 women screened, depending on the proportion of first-time screens (16). This low cancer rate decreases the power to detect a difference between the modalities and necessitates studying a large population. Our cancer detection rate per 1,000 women screened was 4.5 for SFM. This value is in the expected range for our cohort, which consisted primarily of repeat screens (16–22). What could not be predicted was the 39% increase in the cancer detection rate, to 6.3 per 1,000, when FFDM was used in addition to SFM. This boost in the cancer detection rate from the addition of FFDM increases the power of the study to above that which would be predicted on the basis of published values for yearly incidence of breast cancer.

The study design maximizes statistical power by having each subject undergo both SFM and FFDM, thereby acting as her own control. This design allows the use of statistical methods for matched paired data (23,24). The power of these methods is largely determined by the number of discordant cases. Thus the statistical power is further increased by the surprisingly large fraction of cancers (19 of 31) detected with only one modality.

A potential source of bias in our study is the higher number of FFDM-only than SFM-only findings dismissed at the discrepancy conference: 67 versus 14, respectively. In designing the study, we allowed a finding to not be worked up if, by consensus, both readers believed that it could be explained as benign by looking at the other modality or was extremely unlikely to persist on additional images. This aspect of the protocol was included to try to reduce recalls without decreasing the rate of cancers detected. Surprisingly, out of only 67 FFDM-only findings dismissed, two (3%) turned out to be lobular carcinomas, a positivity rate higher than that in the 440 FFDM-only findings that were worked up. The potential for bias from this result is that there may be other undetected cancers in the remaining dismissed findings. The probability that there is even one more cancer in the group is low, however, especially because all of these women have been followed for at least 1 year. Note that the dismissed findings were counted as if they had been called back. Thus the two cancers are counted as true-positives for FFDM and the remaining findings as false-positives.

The level of disagreement overall was surprisingly high. Eight hundred twenty-one examinations had discordant interpretations, which represented 17% of all examinations (821 of 4,945) and 79% (821 of 1,037) of the examinations called positive with at least one modality. To the extent that this discordance is due to differences in the modality, statistical power will be increased, but to the extent that it is due to factors independent of the modality, such as positioning and interpretation, statistical power will be decreased. It is well known from double-reading studies (25–27) that there is large reader variability in the interpretation of screening mammograms.
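
The percentages above follow directly from the reported counts; the short Python sketch below reproduces them. Variable names are illustrative, and the count of examinations positive with both modalities is derived here by subtraction rather than quoted from the text.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


discordant = 821            # examinations with discordant interpretations
total_exams = 4945          # all paired examinations
positive_any = 1037         # positive with at least one modality

# Derived, not quoted: examinations called positive with both modalities.
positive_both = positive_any - discordant

pct_of_all = percent(discordant, total_exams)
pct_of_positive = percent(discordant, positive_any)
print(round(pct_of_all), round(pct_of_positive), positive_both)
```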

Given the advantage in contrast resolution of FFDM and the advantage in spatial resolution of SFM, one might expect that each modality would excel at detecting different types of cancers. FFDM might be expected to be better for finding densities and masses in dense tissue, while SFM might be better for detecting calcifications. Although SFM did recall a larger number of calcification findings, even more than expected given its overall higher recall rate, the number of cancers manifesting as microcalcifications was the same for both modalities. A higher percentage of FFDM-only calcification findings was positive at biopsy. If these trends persist, then an analysis of the reasons for the discordant readings, given by the readers at the discrepancy analysis, can help to delineate whether the difference is due to FFDM being better for distinguishing benign from malignant calcifications on screening views, perhaps because of the ability to magnify the images on the workstation, or is due to superior detection of subtle calcifications with SFM.

The only other trend in the types of cancers detected was a larger percentage of architectural distortion cancers detected only with SFM. This trend is interesting, given the superior spatial resolution of SFM, because detection of architectural distortion depends on resolving fine lines in breast parenchyma. This task is likely more dependent on high spatial resolution than is the detection of microcalcifications, a task that requires only detecting a high-contrast focus, not resolving its shape. More subjects are needed to determine whether this trend in detection of architectural distortion represents a true difference between the modalities.

The only significant result in our study is the lower recall rate of FFDM, which is caused almost entirely by a lower false-positive rate. With the use of the ROC model for evaluating a test, there are two possibilities for such a difference: either FFDM and SFM have different ROC curves, or they are being interpreted at different points along the same curve, with the operating point of SFM being to the right of that of FFDM. Although a difference in the false-positive rate without a difference in sensitivity implies that the two modalities have different ROC curves, because the power to detect a difference in sensitivity is relatively low, we cannot yet exclude a small difference in sensitivity that would explain the difference in false-positive rate as a shift along the same ROC curve.
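
To make the second possibility concrete, the sketch below uses an equal-variance binormal model, a common textbook simplification and not the model fitted in this study, to show how a stricter decision threshold moves a reader along a single ROC curve. The separation value and both thresholds are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def operating_point(threshold, separation):
    """Return (false-positive fraction, true-positive fraction).

    Normal cases score ~ N(0, 1); cancer cases score ~ N(separation, 1).
    A case is called positive when its score exceeds the threshold.
    """
    fpf = 1.0 - STD_NORMAL.cdf(threshold)
    tpf = 1.0 - STD_NORMAL.cdf(threshold - separation)
    return fpf, tpf


SEPARATION = 1.5  # hypothetical; fixes one ROC curve

lenient = operating_point(1.0, SEPARATION)  # further right on the curve
strict = operating_point(1.5, SEPARATION)   # further left on the curve
print(lenient, strict)
```

In this model a lower false-positive fraction produced by a threshold shift alone is always accompanied by a lower true-positive fraction, which is why a small undetected difference in sensitivity could account for the observed difference in false-positive rate.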

To help distinguish operating point shifts from differences in ROC curves, ROC data were collected on a 101-point scale separately from the BI-RADS assessments and recommendations. The finding of nearly identical ROC curves for FFDM and SFM supports the belief that the difference in false-positive rate reflects a shift of operating point along the same ROC curve. Such a shift could be caused by a difference in the reading conditions between SFM and FFDM, so these conditions were kept as equivalent as possible.

One difference at one site was the participation of residents, which, anecdotally, was more likely to occur with SFM than with FFDM. The other site had no resident participation. Because resident participation was not tracked, whether the participation of residents had a noticeable effect on recall rates cannot be determined. Both sites had lower recall rates with FFDM, but the effect was larger at the site that allowed resident participation. Because of this and other possible unappreciated differences in the reading environments, whether the lower recall rate with FFDM is a property of the modality that will be observed in clinical practice or is an artifact of the experimental situation cannot be definitely determined from our data.

In conclusion, a prospective study comparing FFDM to SFM in a screening cohort has been conducted at two mammography sites. Results indicate that the two modalities have indistinguishable sensitivities for cancer detection and indistinguishable ROC curves but that FFDM has a significantly lower recall rate than SFM. Statistical power is limited by the inherently small number of cancers in a screening population. For this reason, a second phase of this study, currently underway, includes a third site and will accrue more subjects to increase statistical power. In addition, a larger multi-institutional trial with a similar design is being planned by the American College of Radiology Investigative Network.


    ACKNOWLEDGMENTS
 
We acknowledge the important contributions of Virginia Vance, RN, and Vanessa Brown, MS, as study coordinators. We are also grateful to the radiologic technologists who have pioneered digital mammography at our two institutions: Melody Lisne, Mary Heck, Cindy Hauger, Mary Benedict, Cheryl Porcaro, Laurie Theard, and Candace Kennedy.


    FOOTNOTES
 
2 Current address: Northwestern University Medical School, Chicago, Ill.

3 Current address: Georgetown University Medical School, Washington, DC.

R.E.H. and G.R.C. are consultants to GE Medical Systems, advising on issues related to FDA approval of full-field digital mammography. J.M.L., L.J.M., and P.K.I. have acted as readers in separate digital mammography research funded by GE Medical Systems as part of its submission to the FDA for approval of full-field digital mammography.

Abbreviations: BI-RADS = Breast Imaging Reporting and Data System, FFDM = full-field digital mammography, ROC = receiver operating characteristic, SFM = screen-film mammography

Author contributions: Guarantors of integrity of entire study, J.M.L., R.E.H.; study concepts, J.M.L., R.E.H., G.A.S.; study design, J.M.L., R.E.H., C.J.D., G.A.S., L.J.M., A.K., G.R.C.; definition of intellectual content, J.M.L., R.E.H., C.J.D.; literature research, J.M.L., R.E.H.; clinical studies, J.M.L., R.E.H., C.C.K., G.A.S., P.K.I., C.J.D., L.J.M., A.K.; data acquisition, all authors; data analysis, J.M.L., R.E.H., G.R.C., P.K.I.; statistical analysis, J.M.L., R.E.H., G.R.C.; manuscript preparation, J.M.L., R.E.H.; manuscript editing, J.M.L.; manuscript review and final version approval, all authors.


    REFERENCES

  1. Young KC, Wallis MG, Ramsdale ML. Mammographic film density and detection of small breast cancers. Clin Radiol 1994; 49:461-465.
  2. Robson KJ, Kotre CJ, Faulkner K. The use of a contrast-detail test object in the optimization of optical density in mammography. Br J Radiol 1995; 68:277-282.
  3. Yaffe MJ, Nishikawa RM, Maidment ADA, Fenster A. Development of a digital mammography system. Proc SPIE 1988; 914:182-188.
  4. Yaffe MJ, Rowlands JA. X-ray detectors for digital radiography. Phys Med Biol 1997; 42:1-39.
  5. Karellas A, Hendrick RE. Equipment: digital mammography. In: Dempsey PJ, Monsees B, eds. Breast imaging: categorical course syllabus. Reston, Va: American Roentgen Ray Press, 1999; 1-9.
  6. Kimme-Smith C, Lewis C, Beifuss M, Williams MB, Bassett LW. Establishing minimum performance standards, calibration intervals, and optimal exposure values for a whole breast digital mammography unit. Med Phys 1998; 25:2410-2416.
  7. Vedantham S, Karellas A, Suryanarayanan S, et al. Full-breast digital mammographic imaging with an amorphous silicon-based flat-panel detector: physical characteristics of a clinical prototype. Med Phys 2000; 27:558-566.
  8. Hendrick RE, Berns E, Chorbajian B, et al. Low-contrast lesion detection: comparison of screen-film and full-field digital mammography (abstr). Radiology 1997; 205(P):274.
  9. Pisano ED, Yaffe MJ, Hemminger BM, et al. Current status of full-field digital mammography. Acad Radiol 2000; 7:266-280.
  10. Lewin JM, Hendrick RE, D’Orsi CJ, et al. Clinical evaluation of a full-field digital mammography prototype for cancer detection in a screening setting: work in progress (abstr). Radiology 1998; 209(P):238.
  11. American College of Radiology. Breast imaging reporting and data system (BI-RADS). 2nd ed. Reston, Va: American College of Radiology, 1995.
  12. Chakraborty DP, Loek HLW. Free-response methodology: alternate analysis and a new observer-performance experiment. Radiology 1990; 174:873-881.
  13. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982; 143:29-36.
  14. Chakraborty DP. Maximum likelihood analysis of free-response receiver operating characteristic (FROC) data. Med Phys 1989; 16:561-568.
  15. Hanley JA, McNeil BJ. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 1983; 148:839-843.
  16. Bassett LW, Hendrick RE, Bassford TL, et al. Quality determinants of mammography: clinical practice guideline no. 13. AHCPR publication 95-0632. Rockville, Md: Agency for Health Care Policy and Research, Public Health Service, U.S. Dept of Health and Human Services, 1994.
  17. Bird RE. Low-cost screening mammography: report on finances and review of 21,716 consecutive cases. Radiology 1989; 171:87-90.
  18. Braman DM, Williams HD. ACR accredited suburban mammography center: three year results. J Fla Med Assoc 1989; 76:1031-1034.
  19. Linver MV, Paster S, Rosenberg RD, Key CR, Stidley CA, King WV. Improvement in mammography interpretation skills in a community radiology practice after dedicated teaching courses: two-year medical audit of 38,633 cases. Radiology 1992; 184:39-43.
  20. Lynde JL. A community program of low-cost screening mammography: the results of 21,141 consecutive examinations. South Med J 1993; 86:338-343.
  21. Robertson CL. A private breast imaging practice: medical audit of 25,788 screening and 1,077 diagnostic examinations. Radiology 1993; 187:75-79.
  22. Sickles EA. Quality assurance: how to audit your own mammography practice. Radiol Clin North Am 1992; 30:265-275.
  23. Siegel S. Nonparametric statistics for the behavioral sciences. New York, NY: McGraw-Hill, 1956.
  24. Liang KY, Zeger SL. Regression analysis for correlated data. Annu Rev Public Health 1993; 14:43-68.
  25. Anderson EDC, Muir BB, Walsh JS, Kirkpatrick AE. The efficacy of double reading mammograms in breast screening. Clin Radiol 1994; 49:248-251.
  26. Thurfjell EL, Lerneval KA, Taube AAS. Benefit of independent double reading in a population-based mammography screening program. Radiology 1994; 191:241-244.
  27. Beam CA, Sullivan DC, Layde PM. Effect of human variability on independent double reading in screening mammography. Acad Radiol 1996; 3:891-897.


