Thoracic Imaging
1 From the Departments of Radiology (D.D.M., W.T.M.Jr., J.M.A., W.B.G., W.T.M.Sr.) and Medicine, Pulmonary and Critical Care Division (R.M.K., G.T.), Hospital of the University of Pennsylvania, 1st Floor Silverstein, 3400 Spruce St, Philadelphia, PA 19104. From the 1997 RSNA scientific assembly. Received February 3, 1998; revision requested April 9; final revision received October 12; accepted December 16. Address reprint requests to W.T.M.Jr. (e-mail: miller2@rad.upenn.edu).
| Abstract |
|---|
MATERIALS AND METHODS: The preoperative chest radiographs obtained in 57 patients who had undergone lung volume reduction surgery were retrospectively scored by five blinded readers for severity and distribution of emphysema, evidence of lung compression, disease heterogeneity, and other features. Comparisons were made with the 3–6-month postoperative functional outcome for each patient.
RESULTS: High disease heterogeneity (score >2) and unequivocal lung compression (score 1) both were 100% predictive of a favorable outcome (FEV1 increase ≥30%). Low heterogeneity (score <1) was 94% predictive of an unfavorable outcome (FEV1 increase <30%), as was a lack of lung compression, which was 92% predictive of an unfavorable outcome. These two features also correlated with an improved 6-minute walk test result, although this correlation was weaker.
CONCLUSION: Chest radiography alone may be sufficient for initial screening. High disease heterogeneity and lung compression on chest radiographs are highly predictive of a favorable functional outcome.
Index terms: Emphysema, pulmonary, 60.751; Lung, diseases, 60.751; Lung, surgery; Lung, ventilation, 60.1295
| Introduction |
|---|
In lung volume reduction surgery, which was described by Brantigan et al (3) in 1959, the overall lung volume is reduced by means of multiple nonsegmental wedge resections aimed at the most diseased areas of the lung when possible, with the purpose of decreasing the total lung volume by approximately 20%–30%. The theoretic basis of the procedure is that returning the lung to a more physiologic total volume may promote elastic recoil and enhance diaphragmatic function (4).
Lung volume reduction surgery has been shown to improve pulmonary function test results, exercise tolerance, and quality of life in many patients (5). However, the functional outcomes in most centers have varied, and it has been suggested that the outcome differences may be predicted preoperatively by using a variety of clinical, laboratory, and radiographic parameters (5–8). We set out to determine whether the functional outcome of patients who underwent lung volume reduction surgery at our institution could have been reliably predicted by using various objective findings on these patients' preoperative chest radiographs alone.
| MATERIALS AND METHODS |
|---|
Clinical Evaluation of Outcome
Pulmonary function testing, including spirometry and lung volume measurement, was performed preoperatively and 3–6 months postoperatively by using standard techniques (9). Body plethysmography was used to measure lung volumes because it is regarded by many to be the most accurate method. Exercise tolerance was assessed with the standardized 6-minute walk test, in a flat hallway without coaching, by using titrated supplemental oxygen to maintain the oxyhemoglobin saturation level at 90% or higher.
For outcome analysis, the percentage changes in FEV1 and 6-minute walk test results were used, as opposed to the absolute values, to better standardize the relative magnitude of improvement or deterioration between patients with different baseline values.
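As a minimal sketch of this normalization, the percentage change relative to baseline can be written as a one-line calculation. The function name and the sample FEV1 values are hypothetical and are not taken from the study data.

```python
def percent_change(baseline: float, follow_up: float) -> float:
    """Return the change from baseline, expressed as a percentage of baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (follow_up - baseline) / baseline * 100.0


# Hypothetical patient: FEV1 of 0.60 L before surgery and 0.84 L afterward.
fev1_change = percent_change(0.60, 0.84)
print(f"FEV1 change: {fev1_change:.1f}%")
```

Expressing the result relative to baseline lets two patients with different starting values be compared on the same scale.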
Radiographic Evaluation and Scoring
Standard inspiratory posteroanterior and lateral chest radiographs were phototimed at 120 kVp, with use of a fixed grid and a wide-latitude screen-film system (Eastman-Kodak, Rochester, NY).
Radiographs were read and scored by five readers independently on different dates. Four of the readers (W.T.M.Jr., W.T.M.Sr., J.M.A., W.B.G.) were subspecialized thoracic radiologists, with levels of experience ranging from 7 to 40 years. The fifth reader (D.D.M.) was a diagnostic radiology resident. It was felt that these differing experience levels might provide insight into the variability and reproducibility of scoring. All of the radiologists were blinded to each patient's ultimate functional outcome. No time constraint for reading and scoring the images was provided. Individual cases and scoring were not discussed between radiologists.
No standard scores were provided, but the radiologists were asked to score the images as follows: A lung compression score of 1 (positive) was given if there were any signs of vascular crowding or atelectasis; otherwise a score of 0 (negative) was given. The severity of emphysema in various zones of each lung was scored on a scale of 0–3 as follows: 0, normal lung; 1, mild disease (<25% airspace replacement); 2, moderate disease (25%–75% airspace replacement); and 3, severe disease (>75% airspace replacement). Each lung was divided into six upper-, middle-, and lower-third zones and into medial versus lateral (central vs peripheral) divisions (Figs 1, 2). An overall severity score was calculated by adding the scores in the 12 individual zones.
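The zone layout and the overall severity sum can be sketched as follows. This is an illustration only: the readers scored the images visually, so the numeric mapping in `zone_score` (including treating 0% replacement as "normal lung") and all identifiers are assumptions introduced here, not part of the study protocol.

```python
from itertools import product

# Two lungs x three vertical thirds x two divisions = 12 zones.
SIDES = ("right", "left")
LEVELS = ("upper", "middle", "lower")
DIVISIONS = ("medial", "lateral")
ZONES = tuple(product(SIDES, LEVELS, DIVISIONS))


def zone_score(airspace_replacement_pct: float) -> int:
    """Map an estimated percentage of airspace replacement to the 0-3 scale."""
    if not 0 <= airspace_replacement_pct <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    if airspace_replacement_pct == 0:
        return 0  # normal lung (assumed to mean no replacement)
    if airspace_replacement_pct < 25:
        return 1  # mild disease
    if airspace_replacement_pct <= 75:
        return 2  # moderate disease
    return 3      # severe disease


def overall_severity(zone_scores: dict) -> int:
    """Add the scores of all 12 zones to obtain the overall severity score."""
    missing = set(ZONES) - set(zone_scores)
    if missing:
        raise ValueError(f"missing zones: {sorted(missing)}")
    return sum(zone_scores[zone] for zone in ZONES)


# A reader who assigns a score of 2 to every zone.
uniform_moderate = {zone: 2 for zone in ZONES}
total = overall_severity(uniform_moderate)
print(total)
```

With 12 zones scored 0–3, the overall severity score ranges from 0 to 36.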
Two examples of scoring are shown in Figures 1 and 2. Figure 1 shows very homogeneous emphysema; two radiologists gave all zones a score of 2, and three radiologists gave all zones a score of 3. This resulted in a heterogeneity score of 0 by all radiologists. No vascular crowding or atelectasis was present, so lung compression was scored as 0 (negative) by all five radiologists. Figure 2 demonstrates the lungs of a patient with heterogeneous disease. All five readers gave the right upper zones a score of 3 and the right base of the lung a score of either 0 or 1. This resulted in high heterogeneity scores of 3 from four readers and 2 from one reader. This patient's radiograph was given a lung compression score of 1 (positive) by all five radiologists, because there was vascular crowding just inferior to the severely diseased right upper lobe.
Several other parameters were scored by two of the radiologists (W.T.M.Jr., D.D.M.). Hyperinflation was initially assessed by using two quantitative measurements. First, the angles of the anterior and posterior costophrenic sulci were measured in degrees and averaged to obtain the diaphragm angle score. Second, the anteroposterior/longitudinal chest wall diameter was calculated by dividing the maximal anteroposterior diameter, from pleural surface to pleural surface, by the maximal longitudinal dimension, from the apex to the dome of the higher hemidiaphragm on the lateral image. Finally, a subjective hyperinflation score of 0 (normal), 1 (mild), 2 (moderate), 3 (severe), or 4 (inversion of diaphragms) was assigned. The percentage of normal or minimally diseased lung also was estimated in each patient by these two radiologists. The other three readers were not asked to score these additional features, because the first two readers found them to be overly cumbersome and time-consuming to score.
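The two quantitative hyperinflation measurements reduce to simple arithmetic, sketched below. The function names, the units, and the sample measurements are hypothetical and serve only to illustrate the calculations described above.

```python
def diaphragm_angle_score(anterior_sulcus_deg: float, posterior_sulcus_deg: float) -> float:
    """Average the anterior and posterior costophrenic sulcus angles (degrees)."""
    return (anterior_sulcus_deg + posterior_sulcus_deg) / 2.0


def chest_diameter_ratio(ap_diameter_cm: float, longitudinal_cm: float) -> float:
    """Divide the maximal anteroposterior diameter by the maximal longitudinal dimension."""
    if longitudinal_cm <= 0:
        raise ValueError("longitudinal dimension must be positive")
    return ap_diameter_cm / longitudinal_cm


# Labels of the subjective hyperinflation score, as listed in the text.
HYPERINFLATION_LABELS = {
    0: "normal",
    1: "mild",
    2: "moderate",
    3: "severe",
    4: "inversion of diaphragms",
}

# Hypothetical measurements from one lateral radiograph.
angle = diaphragm_angle_score(40.0, 50.0)
ratio = chest_diameter_ratio(18.0, 24.0)
print(angle, ratio, HYPERINFLATION_LABELS[2])
```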
Statistical Analyses
The Student t test, with a heteroscedastic one-tailed distribution, was used to compare scores between groups of patients, specifically those patients with a favorable outcome versus those who did poorly. A favorable outcome was defined as an increase of 30% or greater in postoperative FEV1 over the preoperative baseline value, and a poor outcome was defined as a less than 30% increase in FEV1. The 30% cutoff was chosen because it was thought to be a change that was not only reproducible and beyond the range of measurement error but also likely to represent a clinically important improvement. The nonsurvivors were excluded from analysis. A P value of less than .05 was considered to be statistically significant.
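The outcome definition and the group comparison can be sketched as follows. A heteroscedastic t test is commonly computed in the unequal-variance (Welch) form; that this is the exact computation used in the study is an assumption, as are all names and sample scores below. The sketch stops at the t statistic and the degrees of freedom, because the one-tailed P value requires the t distribution, which the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance

FAVORABLE_THRESHOLD_PCT = 30.0  # cutoff stated in the text


def is_favorable(fev1_change_pct: float) -> bool:
    """Favorable outcome: FEV1 increase of 30% or greater over baseline."""
    return fev1_change_pct >= FAVORABLE_THRESHOLD_PCT


def welch_t(group_a, group_b):
    """Return the unequal-variance t statistic and Welch-Satterthwaite degrees of freedom."""
    se_a = variance(group_a) / len(group_a)  # squared standard error, group A
    se_b = variance(group_b) / len(group_b)  # squared standard error, group B
    t = (mean(group_a) - mean(group_b)) / sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(group_a) - 1) + se_b ** 2 / (len(group_b) - 1)
    )
    return t, df


# Hypothetical heterogeneity scores for two small outcome groups.
favorable_scores = [2.0, 2.5, 3.0, 2.5]
poor_scores = [0.5, 1.0, 0.0, 1.5]
t_stat, dof = welch_t(favorable_scores, poor_scores)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

A one-tailed test fits the directional hypothesis here, namely that scores are higher in the favorable group.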
The predictive values, sensitivity, and specificity of the various predictive features studied were determined. In addition, Pearson correlation coefficients were used to test the correlation between various scored features and functional outcome. Tests for statistical significance and correlation coefficients were performed for each radiologist individually. In addition, statistical analyses were performed to determine the composite (mean) scores of all five radiologists. Interobserver variability was assessed by using receiver operating characteristic analysis and comparing the individual radiologists' scores and P values.
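For reference, sensitivity, specificity, and the predictive values all derive from the same 2 × 2 table of a radiographic finding against outcome. The sketch below uses hypothetical counts and a hypothetical function name; it does not reproduce the study's tables.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute standard 2 x 2 table metrics.

    tp: finding present, favorable outcome
    fp: finding present, poor outcome
    fn: finding absent, favorable outcome
    tn: finding absent, poor outcome
    """
    def ratio(numerator: int, denominator: int):
        # Return None rather than fail when a row or column of the table is empty.
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "positive_predictive_value": ratio(tp, tp + fp),
        "negative_predictive_value": ratio(tn, tn + fn),
    }


# Hypothetical counts for 20 patients.
metrics = diagnostic_metrics(tp=8, fp=2, fn=4, tn=6)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```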
| RESULTS |
|---|
[…] (≥30% FEV1 increase) consisted of 25 patients with a mean increase in FEV1 of 73.3% ± 26.8. The mean 6-minute walk test measurements for the two groups were 29.2% ± 43.5 and 30.3% ± 19.6, respectively, which were not significantly different.
As seen in Table 1, there was a highly significant difference in the mean heterogeneity score between the two outcome groups. Those with a poor outcome generally demonstrated a low heterogeneity score, in contrast to those with a favorable outcome, who consistently demonstrated high heterogeneity. Similarly, those patients with a favorable outcome tended to have higher heterogeneity as measured by the peripheral zone versus central zone disease severity scores; these differences also were statistically significant. Table 2 shows the outcomes of patients with specific radiographic heterogeneity scores. Note that of those patients (n = 17) with very low heterogeneity (mean score <1), 16 (94%) had an unfavorable outcome, with a <30% increase in FEV1. Similarly, by using the 6-minute walk test result as a measure of outcome, 15 (88%) of the 17 patients with a heterogeneity score of less than 1 were unable to increase their 6-minute walk test measurement to 30% or more over the baseline. In contrast, all 15 (100%) of the patients with very high heterogeneity (mean score ≥2) had a favorable outcome, demonstrating a 30% or greater increase in FEV1. By using the 6-minute walk test result as a measure of outcome, seven (47%) of the 15 patients whose radiographs were given a heterogeneity score of 2 or higher experienced an increase in the 6-minute walk test measurement of 30% or more over the baseline.
As shown in Table 1, the estimated percentage of normal lung, overall score of emphysema severity, and the various measures of hyperinflation were not significantly different between the two outcome groups.
Table 3 shows the Pearson correlation coefficients for each of the measured features in relation to FEV1 and 6-minute walk test. Again, both heterogeneity and lung compression correlated strongly with increased FEV1. Interestingly, these two features had a weaker correlation with the 6-minute walk test; however, as the data above demonstrate, a definite correlation between 6-minute walk test and these two features was present. The calculated overall disease severity showed no significant correlation with outcome.
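For readers unfamiliar with the statistic, a Pearson correlation coefficient can be computed directly from paired observations, as in the minimal sketch below. The paired values are hypothetical and are not the data behind Table 3.

```python
from math import sqrt


def pearson_r(x, y) -> float:
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(ss_x * ss_y)


# Hypothetical pairs: heterogeneity score versus percentage increase in FEV1.
heterogeneity = [0, 1, 2, 3]
fev1_increase_pct = [5, 20, 45, 80]
r = pearson_r(heterogeneity, fev1_increase_pct)
print(f"r = {r:.3f}")
```

Values near +1 or −1 indicate a strong linear relationship, and values near 0 indicate little or none.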
In contrast, interreader variability of heterogeneity and lung compression was minimal. Figure 3 illustrates the results of receiver operating characteristic analysis of the five independent radiologists' scores for heterogeneity, including the Az value (that is, the area under the receiver operating characteristic curve) for each radiologist. Receiver operating characteristic analysis could not be performed for lung compression, because this variable was discrete (ie, noncontinuous). However, there was unanimous agreement on compression in 35 of 57 cases and agreement among four of the five radiologists on another 10 of 57 cases. Put more simply, there was agreement on lung compression among at least four of the five readers in 79% of the cases.
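The 79% figure follows from the case counts given above, as the short check below shows. The helper for counting agreeing readers is a hypothetical illustration of how per-case agreement on a binary score might be tallied.

```python
def largest_agreeing_group(reader_scores) -> int:
    """Number of readers in the majority for one case (scores are 0 or 1)."""
    positives = sum(reader_scores)
    return max(positives, len(reader_scores) - positives)


# Counts reported in the text.
unanimous_cases = 35
four_of_five_cases = 10
total_cases = 57

agreement_pct = 100 * (unanimous_cases + four_of_five_cases) / total_cases
print(f"{agreement_pct:.1f}% of cases had agreement among at least four readers")

# Hypothetical case in which four of five readers score compression as present.
print(largest_agreeing_group([1, 1, 1, 0, 1]))
```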
| DISCUSSION |
|---|
The work of Slone and colleagues (5,8) has provided an important foundation for the use of imaging techniques to help predict the outcome from lung volume reduction surgery. The results of their studies, which involved the use of inspiration and expiration chest radiography, chest computed tomography (CT), and nuclear medicine ventilation-perfusion scanning, suggest that high disease heterogeneity, evidence of lung compression, and higher residual normal lung are predictive of a favorable outcome. Our data confirm many of their conclusions; however, the results of our study suggest that conventional chest radiography alone is highly predictive, without the need for additional expensive or invasive imaging modalities.
The results of this study show that high disease heterogeneity on standard chest radiographs is strongly predictive of a favorable outcome following lung volume reduction surgery. Specifically, we found that a very high heterogeneity score (≥2) virtually guaranteed a favorable result (positive predictive value of 100%) in our patient population. In contrast, an extremely low heterogeneity score (<1) strongly correlated with a poor outcome and thus should be helpful in patient selection. In our study population, this criterion alone would have excluded 17 (30%) patients from eligibility for the procedure; only one could have done well (positive predictive value of 94%), as indicated by an increase in FEV1 of 30% or more over the baseline.
We did not provide disease severity score standards for our radiologists, and thus the disease severity scores of individual lung zones varied among the different radiologists. However, the relative differences in disease severity between lung zones, which is what determined the heterogeneity, was very consistent between readers, as demonstrated by the highly concordant receiver operating characteristic curves. Thus, absolute standards may not be necessary for determining heterogeneity, because an internal standard of sorts exists within each case. However, standards may be of value for achieving better consistency in scoring overall disease severity (vs residual normal lung). This feature, often referred to as normal pulmonary reserve, is thought by some investigators (5) to be predictive of surgical mortality. If it were important to get a more consistent overall severity from reader to reader, a scoring strategy in which individual lung zones are scored as 0 for normal or near normal lung and as 1 for advanced disease could be devised. With such a strategy, one could expect much less interreader variability among individual and overall severity scores, but this might be at the expense of a loss in the predictive value of the heterogeneity score.
Lung compression should also be helpful in patient selection for lung volume reduction surgery. In our population, if those patients who indisputably had no lung compression (n = 25) had been denied lung volume reduction surgery, we would have prevented 44% of the candidates from undergoing this expensive operation and excluded only two who would have done well (positive predictive value of 92%). In contrast, the finding of indisputable lung compression provided a virtual guarantee of a favorable outcome (positive predictive value of 100%) in our population.
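The arithmetic behind these exclusion figures, and the corresponding ones for low heterogeneity, can be checked with a short sketch. The counts come from the text; the function and key names are hypothetical.

```python
def exclusion_tradeoff(total_patients: int, excluded: int, excluded_favorable: int) -> dict:
    """Summarize a selection criterion that would deny surgery to `excluded` patients.

    excluded_favorable: excluded patients who in fact had a favorable outcome.
    """
    return {
        "excluded_pct": 100 * excluded / total_patients,
        "poor_outcome_pct_among_excluded": 100 * (excluded - excluded_favorable) / excluded,
    }


# No lung compression: 25 of 57 patients, 2 of whom did well.
no_compression = exclusion_tradeoff(57, 25, 2)
# Heterogeneity score <1: 17 of 57 patients, 1 of whom did well.
low_heterogeneity = exclusion_tradeoff(57, 17, 1)

for label, result in (("no compression", no_compression),
                      ("low heterogeneity", low_heterogeneity)):
    print(label, {key: round(value) for key, value in result.items()})
```

The second value in each result is the proportion of excluded patients who would indeed have had a poor outcome, which the text reports as the predictive value of each criterion.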
Our decision to use a 30% increase in FEV1 as the cutoff for favorable versus unfavorable outcome may be subject to criticism. We believe, however, that this cutoff represents a magnitude of improvement that is well beyond the error of measurement of the test and, hence, statistically meaningful and reproducible. We also believe that this magnitude of change represents a clinically relevant improvement. Ultimately, it is necessary for individual surgeons and pulmonologists to decide which threshold represents a functional improvement that is substantial enough to warrant surgery at their particular center.
Although in our study the radiographic features of heterogeneity and lung compression were highly predictive of a postoperative increase in FEV1, their correlation with the 6-minute walk time was somewhat weaker. Some have argued that the 6-minute walk time is a better functional evaluation than is FEV1, which is more of a physiologic measure (5). In our population, the two measures did not correlate strongly. There is no clear explanation for this poor correlation, but it has been reported by other investigators (9,11,12). It may be because lung volume reduction surgery probably influences physiologic parameters that are not reflected in pulmonary function tests, such as gas exchange, cardiac function, and perceived dyspnea (2,9). There is probably no single test that best evaluates the overall outcome following lung volume reduction surgery. Inasmuch as we relied on FEV1 as our primary measure of outcome, this may be considered a limitation of the study. However, the FEV1 measurement is reproducible, standardized, and, in our opinion, not as prone to be influenced by motivation and psychologic factors. In addition, FEV1 is one of the few objective measures that has been shown to correlate with mortality caused by emphysema (13).
Some may consider it a mistake to have included a resident as one of the five readers. However, we believed it was necessary to include one reader who was not subspecialized and, in fact, was considerably less experienced to see how a less specialized and less experienced reader would perform in the model. The resident's scores were not significantly different from those of the other four radiologists (Fig 3), which suggests that readings by general radiologists or experienced clinician readers may be as consistent and predictive of outcome in our model as those by subspecialized thoracic radiologists.
Twelve (10%) of the 120 patients in our initial population died within 90 days after the procedure. These patients were not included in the analysis. All 12 of the patients who died succumbed to infection or sepsis, which are complications that are not predictable by using preoperative radiography. The high mortality in this series may reflect the steep learning curve of performing lung volume reduction surgery; these were the first 120 patients to undergo this procedure at our institution. However, this was not a major limitation of our study, because mortality was not examined as an end point.
Roughly one-half of the patients who underwent lung volume reduction surgery at our institution during the study period had preoperative radiographs that were available for our analysis. Although this is a limitation of our study, it should be noted that the functional outcomes in our study group of 57 patients were virtually identical to those in the series studied by Kotloff et al (9). Specifically, one-third of our study patients demonstrated an increase of less than 20% over the baseline FEV1, one-third demonstrated a 20%–60% increase in FEV1, and one-third demonstrated a greater than 60% increase in FEV1.
It seems counterintuitive that hyperinflation was not predictive, particularly given that the basis of the lung volume reduction procedure is a reduction in hyperinflation in the chest. This may derive from the fact that most patients who are chosen for lung volume reduction surgery already have substantial hyperinflation, and this may be why the variance in hyperinflation scores was small. However, an equally plausible explanation is that the findings of disease heterogeneity and lung compression are much more powerful predictors of outcome than is hyperinflation, and hyperinflation would have shown a predictive value with greater statistical power in a study with more patients.
Other investigators (14–16) have recently evaluated the role of other imaging modalities such as CT and ventilation-perfusion scanning in the preoperative examination of patients who are being considered for lung volume reduction surgery. The results of our study have shown that the findings on chest radiographs alone may be highly predictive of outcome in this population. Although further studies on the radiographic evaluation for lung volume reduction surgery are needed, the preliminary data on the radiographic prediction of outcome are compelling. Clearly, radiography will have an important role in patient selection for this promising but costly approach to the management of advanced emphysema in the years to come.
| Footnotes |
|---|
Abbreviation: FEV1 = forced expiratory volume in 1 second
Author contributions: Guarantor of integrity of entire study, D.D.M.; study concepts and design, D.D.M., W.T.M.Jr.; definition of intellectual content, D.D.M., W.T.M.Jr.; literature research, D.D.M., W.T.M.Jr.; clinical studies, all authors; data acquisition, D.D.M.; data and statistical analyses, D.D.M., W.T.M.Jr.; manuscript preparation, D.D.M., W.T.M.Jr.; manuscript editing and review, all authors.