Radiology
(Radiology. 1999;211:571-578.)
© RSNA, 1999


Technical Developments

Vertebral Shape: Automatic Measurement with Active Shape Models1

Paul P. Smyth, PhD, Christopher J. Taylor, PhD and Judith E. Adams, FRCP, FRCR

1 From the Departments of Medical Biophysics (P.P.S., C.J.T.) and Diagnostic Radiology (J.E.A.), University of Manchester, Oxford Rd, Manchester, M13 9PT, England. From the 1997 RSNA scientific assembly. Received January 6, 1998; revision requested March 18; revision received August 11; accepted October 26. Address reprint requests to J.E.A.


    Abstract
The shape and appearance of the spine on lateral dual x-ray absorptiometry scans were statistically modeled. To measure vertebral shape accurately, rapidly, and automatically with a computer, this trained model was matched to previously unseen scans. The technique captured the full vertebral shape, was faster than manual analysis, and was as accurate as human observers in the measurement of vertebral shape.

Index terms: Bones, absorptiometry, 30.1295 • Computers, diagnostic aid, 30.1295 • Images, analysis, 30.1295 • Images, interpretation, 30.1295


    Introduction
Osteoporosis is a condition characterized by a reduction in bone mass, which results in decreased physical strength of the skeleton and increased risk of fractures. The most common sites for such fractures are the hip (femoral neck), vertebrae, and wrist (distal radius). Of these, hip fractures are the most serious for the patient in terms of both morbidity and mortality (1) and the burden placed on national health care systems (2).

There are two common forms of osteoporosis (3). The first is postmenopausal osteoporosis (type I), which affects women after menopause, when the level of production of hormones such as estrogen declines. Type I osteoporosis leads to a loss of spongy trabecular bone, which is very sensitive to changes in skeletal metabolism. This loss of trabecular bone leads to vertebral and wrist fractures. Senile (type II) osteoporosis affects both women and men aged 70 years and older and arises from a loss of cortical (compact) and trabecular bone, which often leads to hip fractures.

The aim of most preventions and treatments for osteoporosis is to reduce the chance that patients will experience fragility fractures, particularly of the hip, in the remainder of their lives. In trials of new treatments, efficacy must be established in as short a time as possible. Vertebral fractures are more common than are hip fractures, and they occur in younger patients. As a consequence, prevalent vertebral fractures are often used as an entry criterion into some drug trials for osteoporosis, and incident vertebral fracture during treatment is used to assess the efficacy of therapy.

In the clinical environment, vertebral fractures are conventionally detected on lateral spinal radiographs by a radiologist, who decides subjectively if any of the patient's vertebrae appear fractured. For efficacy trials of treatments, in which changes in vertebral shape must be detected, however, a more quantitative approach is necessary.

There are two categories of methods for quantitative assessment of vertebral fracture. The first is the standardized manual approach (often referred to as "semiquantitative") (4), in which a radiologist classifies each vertebra by means of visual examination of the radiograph according to strict definitions of degree and type of fracture. Three types of fracture (wedge, end plate, and crush) are recognized. In the second method, six (or more) points are placed manually on the vertebral margins on radiographs or dual x-ray absorptiometry (DXA) scans. On the basis of these points, anterior, middle, and posterior vertebral heights can be calculated.
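As an illustration of the six-point method, the following minimal sketch computes the three heights from six marked points. The point labels ('sa', 'sm', 'sp', 'ia', 'im', 'ip') and the coordinates are assumptions made for this example, not part of any published marking protocol, and each height is taken here simply as the straight-line distance between the superior and inferior points.

```python
from math import dist

def vertebral_heights(points):
    """Return anterior, middle, and posterior heights from six (x, y) points.

    `points` maps hypothetical labels to coordinates in millimetres:
    'sa', 'sm', 'sp' on the superior end plate and
    'ia', 'im', 'ip' on the inferior end plate
    (anterior, middle, posterior in each case).
    """
    return {
        "anterior": dist(points["sa"], points["ia"]),
        "middle": dist(points["sm"], points["im"]),
        "posterior": dist(points["sp"], points["ip"]),
    }

# Illustrative coordinates only (not data from the study).
example = {
    "sa": (0.0, 30.0), "sm": (15.0, 29.0), "sp": (30.0, 30.0),
    "ia": (0.0, 6.0), "im": (15.0, 5.0), "ip": (30.0, 0.0),
}
heights = vertebral_heights(example)
```

In this toy example the anterior and middle heights are shorter than the posterior height, which is the kind of difference a height-based fracture criterion would examine.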

Current quantitative methods have three limitations. First, the manual placement of points on the vertebrae is subjective, which introduces variability into the process of detecting vertebral fractures. Second, manual point placement is laborious and time-consuming, requiring 15 minutes or more to analyze the radiographs or DXA scans obtained in one patient. Third, the six points marked on each vertebra might not completely describe its shape, which may change more subtly with osteoporosis than can be described simply. Detection of vertebral fractures on serial images is important; such detection is particularly sensitive to inconsistencies in point placement.

The technique described herein is intended to eliminate the three problems associated with manual measurement of vertebral shape. We used an active shape model (ASM) (5,6), a general method from the field of computer vision, to locate and measure the shapes of variable objects in images. An ASM automatically and quickly measures the shape of the full vertebral outline rather than the shape at only the six points. Our intention was to extract more information from the images in a way that is more rapid, less expensive, and more accurate than is possible with current manual methods. Because DXA images are obtained digitally and do not experience the projection distortion seen on conventional radiographs, the ASM was applied to the measurement of vertebral shape on lateral DXA scans of the spine.

With use of DXA images obtained in postmenopausal women, the accuracy and precision in locating the full vertebral outline were compared between the automatic method and human operators using the standard six-point technique.

Vertebrae with osteoporotic fractures are the most relevant clinically; therefore, the automatic technique must be capable of measuring their shape. Our evaluation of the method was performed with a small set of DXA scans of the lumbar vertebrae obtained in women with osteoporosis who had experienced vertebral fractures.

In addition, we investigated the possible advantages of a more complete description of vertebral shape. We evaluated the ability of a statistical classifier to distinguish fractured from normal vertebrae by using various vertebral shape descriptions. We compared shape descriptions based on vertebral height, the positions of the six manually placed points, and the full vertebral outline.


    Materials and Methods
The ASM
An ASM (5,6) is a statistical model that describes "what an object looks like" in terms of its shape and its imaging appearance. An ASM is created by "training" it with sample images on which the boundaries of the objects of interest have been annotated by an expert. After the ASM is trained, it can be used to locate the objects on a new image by matching the model, which describes the expected shape and appearance, to findings on the new image.

An ASM is a type of deformable template model that does not just describe a single fixed object shape and appearance but also describes the ways in which the objects were observed to vary in shape and appearance over the set of training images. ASMs have an important advantage over other methods for locating objects on images because they are specific to the type of object under study and describe only the variation observed in the training examples and do not allow "illegal" variation. ASMs have been applied successfully to a range of medical image interpretation problems in two and three dimensions (5,7).

An ASM contains two separate components that describe the object shape and appearance. Object shape is described by means of a point distribution model (PDM), which is generated by performing statistical analysis of the object shapes observed over the set of training images. The contour around the structure on each training image is described with a set of n landmark points, which are manually annotated by an expert. Each contour can then be described with a vector, x = {x1, y1, x2, y2, . . ., xn, yn}, where (xi, yi) is the position of the ith landmark point on the contour. The training contours are aligned as closely as possible by means of scaling, rotation, and translation. Then, to describe the main independent ways in which the training shapes vary, principal component analysis (8) is performed by using the deviations of each training shape vector from the mean shape vector $\bar{x}$.

The PDM represents shape in terms of a mean shape and a set of linearly independent modes of shape variation that describe the variation over the training set. Only the most common modes of shape variation are needed to describe members of the training set to a chosen level of accuracy.
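The principal component analysis step can be written compactly as follows. This is a sketch of the standard formulation rather than a transcription of the authors' implementation; the symbols N (number of training shapes), S (covariance matrix), p_k (kth mode), and lambda_k (variance explained by the kth mode) are introduced here for illustration, and the 1/(N - 1) normalization is an assumption (some formulations use 1/N).

```latex
\bar{x} = \frac{1}{N}\sum_{j=1}^{N} x_j, \qquad
S = \frac{1}{N-1}\sum_{j=1}^{N} (x_j - \bar{x})(x_j - \bar{x})^{T}, \qquad
S\,p_k = \lambda_k\,p_k, \quad \lambda_1 \ge \lambda_2 \ge \dots \ge 0
```

Keeping only the first t modes, with t chosen so that the retained variances account for a chosen fraction of the total variance, gives the truncated set of modes referred to in the text.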

Any new shape, x, of the same type as those observed in training can then be generated by adding combinations of the modes of variation contained in a matrix P to the mean shape, $\bar{x}$, with a vector of weights b controlling the influence of each mode: $x = \bar{x} + Pb$.

A PDM of the L1 vertebra represented with 73 landmark points is shown in Figure 1. This shows the mean shape of the vertebra and the three most common modes of variation, or ways in which its shape varied from the examples with which the model was trained. As the shape was varied between the mean shape and the 3-SD limits of each mode of variation, the shape model remained anatomically plausible as a vertebra. The PDM did not allow unrealistic shapes to be generated, which is unique among shape modeling methods, to our knowledge.
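The way a PDM generates shapes within its allowed range can be sketched as below. The function name, the toy square model, and the hard clamping of each weight to 3 SDs are assumptions for illustration; the text states only that shapes were varied between the mean and the 3-SD limits of each mode.

```python
from math import sqrt

def generate_shape(mean_shape, modes, variances, b, limit_sd=3.0):
    """Return mean + P b, with each weight clamped to +/- limit_sd SDs.

    mean_shape : list of 2n coordinates (x1, y1, ..., xn, yn)
    modes      : list of t mode vectors (the columns of P), each of length 2n
    variances  : list of t variances, one per mode
    b          : list of t weights
    """
    clamped = []
    for weight, variance in zip(b, variances):
        bound = limit_sd * sqrt(variance)
        clamped.append(max(-bound, min(bound, weight)))
    shape = list(mean_shape)
    for weight, mode in zip(clamped, modes):
        for k, component in enumerate(mode):
            shape[k] += weight * component
    return shape

# Toy model: a unit square whose single mode changes its height.
mean_square = [0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0]
height_mode = [[0.0, -0.5, 0.0, -0.5, 0.0, 0.5, 0.0, 0.5]]
mode_variance = [0.04]  # SD = 0.2, so weights are limited to about +/- 0.6

taller = generate_shape(mean_square, height_mode, mode_variance, b=[0.4])
capped = generate_shape(mean_square, height_mode, mode_variance, b=[5.0])
```

The second call requests a weight far outside the training range, and the clamp pulls it back to the limit, which is how a model of this kind avoids producing implausible shapes.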



Figure 1. Schematic of the PDM of the shape of the L1 vertebra shows the mean shape and the limits of the first three modes of shape variation (A, B, C), which describe the most common ways in which vertebral shape can vary. The range of shapes allowed with the model always remains realistic. 1, 2, 3 = rank order of mode of variation.

 
The image appearance of the vertebra is modeled by analyzing the gray-level image profile at each landmark point in a direction perpendicular to the object contour. Typical gray-level (image brightness) profiles for the anterior and inferior vertebral margins are shown in Figure 2. At the anterior margin, there is an increase in image brightness, while at the inferior margin, the image brightness is greatest on the contour that has been marked. These profiles are collected for all 73 landmark points along the vertebral contour for each training example. The profile at one landmark point is described as a vector. Then, principal component analysis is performed of the set of training profiles for that landmark point to describe the likely profile for that landmark point as a mean profile plus a set of modes of profile variation. These gray-level profile models are capable of modeling any type of profile appearance, and they are derived purely from the annotated training images. Full details about the training and use of these combined gray-level and shape models have been published previously (6).



Figure 2. Typical gray-level profiles for a vertebra imaged with DXA.

 
Once shape and appearance models have been constructed, the ASM can be used to locate the modeled object on new images. An initial approximation to the location of the object is projected onto the image and iteratively refined. The image brightness around the current position of each landmark point in the shape model is probed for gray-level evidence that better matches the profile model, from which a new position for each landmark point is suggested. The PDM then attempts to deform itself to fit to the new suggested positions of each landmark point within the constraints imposed by its modes of variation. The process of searching for gray-level evidence and deforming the shape model is repeated until the ASM converges to a solution, as it is drawn toward the true location of the vertebra. Because the PDM imposes tight shape constraints, only objects with a shape similar to those observed in the training set (ie, realistic vertebrae) can be located on images, which makes the search robust to image noise and clutter. Because the gray-level profile models are specific to the profiles observed on the training images, only objects with an appearance similar to those observed in training will be located, which further improves robustness.
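The gray-level probing step for a single landmark can be sketched as follows. This simplified version compares the sampled profile only with a mean profile by sum of squared differences; the published method uses a statistical profile model with modes of variation, so the function and the toy profiles here are illustrative assumptions.

```python
def best_profile_offset(sampled, mean_profile):
    """Return the shift (in pixels) at which `mean_profile` best fits `sampled`.

    `sampled` is a gray-level profile taken along the normal to the contour,
    centred on the current landmark position and longer than `mean_profile`.
    A shift of 0 means the landmark is already at the best position.
    """
    window = len(mean_profile)
    n_positions = len(sampled) - window + 1
    if n_positions < 1:
        raise ValueError("sampled profile must be at least as long as the model")
    costs = []
    for start in range(n_positions):
        segment = sampled[start:start + window]
        costs.append(sum((s - m) ** 2 for s, m in zip(segment, mean_profile)))
    best_start = min(range(n_positions), key=costs.__getitem__)
    centred_start = (len(sampled) - window) // 2
    return best_start - centred_start

# Toy step edge: darker tissue (10) followed by brighter bone (50).
model = [10, 10, 50, 50]
probe = [10, 10, 10, 10, 10, 50, 50, 50]
shift = best_profile_offset(probe, model)
```

The suggested shifts for all landmarks would then be passed to the shape model, which moves toward them only as far as its modes of variation allow.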

Use of a multiresolution approach, in which gray-level models are trained on images that range from low to high spatial resolution, was found to improve the speed and robustness of image search (9). We used this approach in our experiments.

Training
Lateral spine DXA images were obtained in 78 women (mean age, 61 years; age range, 44–80 years) with a DXA scanner (model QDR2000plus; Hologic, Waltham, Mass). The women were selected randomly from those who underwent bone densitometry at a local clinic (scans courtesy of Ian Smith, MD). The dimensions of the pixels in the lateral scans were 0.9 × 1.0 mm. With use of a computer and a mouse pointer, one of the authors (P.P.S.) marked 73 landmark points per vertebra around the full contours of 10 vertebrae on each image, with the advice of an experienced radiologist (J.E.A.).

Instead of the single vertebra used in the illustrative examples described previously, six thoracic vertebrae, from T7 to T12, and four lumbar vertebrae, from L1 to L4, were included in the model of the spine. The three most common modes of shape variation obtained with the resultant spine PDM are shown in Figure 3. The schematics illustrate the ways in which the shape of the spine and vertebrae varied between individuals in the training set.



Figure 3. PDM images show the three most common modes of normal spine shape variation. Top: Normal spinal movement. Middle: Kyphosis. Bottom: Strong correlation in vertebral heights (arrows) across vertebral levels.

 
The two most common modes of shape variation describe normal spinal movement and kyphosis. The third mode of variation describes a simultaneous change in all vertebral heights, indicating that, although there is wide natural variation in vertebral shape between individuals, the shapes of different vertebrae tend to vary together between individuals with normal vertebrae. This information has been obtained purely by means of statistical analysis of the training shapes, without any prior knowledge about the ways in which vertebral shapes might vary.

Image Search
The image search was performed in two phases. First, the ASM of the spine (the PDM) was initialized roughly near the true spine location. The ASM then deformed itself so that the gray-level profile models were best matched to the edges and ridges in the image that corresponded to the vertebrae.

The image search was initialized by manually placing three points—which corresponded to T7, T12, and L4—on the image (Fig 4a). The spine ASM was then fitted to these points (Fig 4b), and the search was started. The ASM moved and deformed as it was drawn toward the vertebrae. The result of a typical ASM search is shown in Figure 4c.



Figure 4. ASM search on DXA images, initialized with three manually marked points. (a) Normal DXA scan, with manually placed points marked. (b) Starting ASM search position, initialized with three points. (c) Final ASM search position, after convergence.

 
Comparison of ASM Accuracy and Manual Reproducibility
We performed a set of automatic search experiments to provide an unbiased estimate of the errors to be expected with the automatic method in real clinical use. The search was always performed on a scan that had not been included in the set of images with which the model had been trained, ensuring that each scan was treated as unseen. This test was performed as a "leave-one-out" cross-validation experiment (10), in which the ASM was trained with all images but one and its performance was then tested on the excluded image. This train-and-test process was repeated so that each image in turn was left out of training and used as the test example.

Each search experiment was performed 20 times for each image, with the positions of the three manual starting points randomly altered to simulate human operator variation. Precision values were obtained between these 20 repetitions, which represented the degree to which the final solution was susceptible to misplacement of the initial manual points. If the precision errors were small, the technique could be claimed to be operator independent.

The search results were characterized as a distribution of successfully located cases and some failures. If the error in automatically locating a vertebra was greater than 3 SDs from the mean of the distribution of successfully located cases, the search for that vertebra was classified as a failure. The accuracy of ASM vertebral location was measured with the root-mean-square of the point-to-line distance errors from each point on the "true" vertebra contour (as annotated during training) to the nearest point on the ASM-located contour (Fig 5, part A).
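The point-to-line error measure can be sketched as below, treating the ASM-located contour as a closed polyline. The function names and the toy squares are assumptions for illustration.

```python
from math import hypot, sqrt

def point_to_segment(p, a, b):
    """Shortest distance from point p to the segment a-b (all (x, y) tuples)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return hypot(px - ax, py - ay)
    # Projection parameter, clamped so the nearest point stays on the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))

def rms_point_to_line_error(true_points, located_contour, closed=True):
    """RMS of distances from each true point to the located polyline."""
    n = len(located_contour)
    last = n if closed else n - 1
    segments = [(located_contour[i], located_contour[(i + 1) % n])
                for i in range(last)]
    squared = [min(point_to_segment(p, a, b) for a, b in segments) ** 2
               for p in true_points]
    return sqrt(sum(squared) / len(squared))

# Toy check: a located square that is 1 unit larger on every side.
truth = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
located = [(-1.0, -1.0), (11.0, -1.0), (11.0, 11.0), (-1.0, 11.0)]
error = rms_point_to_line_error(truth, located)
```

Measuring to the nearest point on the line, rather than to the corresponding landmark, means that a landmark sliding along the contour is not counted as an error.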



Figure 5. A, Measurement of point-to-line error from a true contour position to a proposed search solution. B, Measurement of manual reproducibility. Dashed lines show directions in which errors are measured. Shaded areas show distribution of manually marked points.

 
To compare the accuracy of vertebral location with ASM search to that with a standard manual method, a random subset of 40 of the 78 images was annotated by four trained operators (P.P.S. and others) by using the standard six-point method (11). A measure that described the reproducibility of the manual method was devised. In the six-point marking scheme, only the vertical component of error in the placement of vertebral midpoints is important. In placement of vertebral corners, however, both directional components of error are important (Fig 5, part B). Reproducibility for each of the six points was measured as the variance between the placements of the four operators, ignoring the horizontal component of variation in midpoint placement. Reproducibility for each vertebra was calculated as the root-mean-square of the reproducibilities of each of the six points.
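One possible reading of this reproducibility measure is sketched below. Several details are assumptions: the point labels, the use of the population variance, and the conversion of each point's variance to an SD before the root-mean-square is taken. Only the vertical spread is used for the two midpoints, in line with the statement that only the vertical error matters there.

```python
from math import sqrt
from statistics import pvariance

MIDPOINTS = {"sm", "im"}  # hypothetical labels for the two midpoints

def point_spread(placements, vertical_only):
    """SD of one point's placements across operators, in image units."""
    var_y = pvariance([y for _, y in placements])
    if vertical_only:
        return sqrt(var_y)
    var_x = pvariance([x for x, _ in placements])
    return sqrt(var_x + var_y)

def vertebra_reproducibility(marks):
    """RMS over the marked points of the per-point spread between operators.

    `marks` maps a point label to a list of (x, y) placements, one per operator.
    """
    spreads = [point_spread(p, label in MIDPOINTS) for label, p in marks.items()]
    return sqrt(sum(s * s for s in spreads) / len(spreads))

# Illustrative placements by four operators (not data from the study).
marks = {
    "sa": [(0, 30), (0, 32), (0, 30), (0, 32)],      # corner: vertical SD of 1
    "sm": [(14, 29), (16, 29), (15, 29), (15, 29)],  # midpoint: only x varies
    "sp": [(30, 30)] * 4,
    "ia": [(0, 6)] * 4,
    "im": [(15, 5)] * 4,
    "ip": [(30, 0)] * 4,
}
reproducibility = vertebra_reproducibility(marks)
```

In this toy case the horizontal disagreement at the superior midpoint contributes nothing, so the result reflects only the vertical disagreement at one corner.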

The measures that describe the performance of the automatic and manual techniques are not identical, so they should be compared with caution. The accuracy of the automatic method has been compared with the reproducibility of the manual method. Manual reproducibility is only a lower limit on manual error, as repeat measurements may be clustered away from the true location, yielding good precision but poor accuracy. Therefore, if the automatic accuracy error is found to be smaller than the manual reproducibility error, the automatic technique can be said to have performed more accurately.

Automatic Morphometry of Fractured Vertebrae
Lumbar DXA scans were acquired in 32 patients with at least one lumbar vertebral fracture (scans courtesy of Peter Steiger, PhD). Landmark points were manually marked on the four lumbar vertebrae L1–L4. For these scans, both the external vertebral contours and the inner contours of the upper and lower vertebral end plates were modeled to allow better detection of end plate vertebral fractures, in which the inner contours of the end plates become distorted.

The search was initialized from two manually marked points, which corresponded to the upper end plate of L1 and the lower end plate of L4, and the ASM search was performed. The search process with a vertebral model including fractured vertebrae is shown in Figure 6.



Figure 6. ASM search in fractured lumbar vertebrae, initialized with two manually marked points. (a) Lumbar DXA scan, with end plate fractures of L1–L4 vertebrae. (b) Starting ASM search position, initialized with two points. (c) Final ASM search position, after convergence.

 
To measure the accuracy of automatic vertebral shape measurement on these scans of lumbar fractures, experiments were performed that were similar to those for the normal scans. The ASM was trained with 32 scans that contained vertebral fractures and 70 normal scans. Search results are presented for the 32 scans that contained fractures. Fewer scans with fractures were available than were normal scans, resulting in the ASM of fractured vertebrae being less well trained than that of the normal vertebrae.

Detection of Vertebral Fractures
To evaluate the ability to correctly classify normal and fractured vertebrae with different shape descriptions, we investigated whether vertebral fractures could be detected better with the full vertebral contour than with vertebral heights alone and whether all shape information available with manual analysis is currently being used effectively.

Three different shape descriptions were used to classify vertebrae. The first description comprised the anterior, middle, and posterior vertebral heights as obtained with conventional manual six-point morphometry. The second description could also be derived with conventional manual morphometry and comprised the actual positions of the six manually marked points, after they were adjusted for varying rotation, scaling, and translation of the vertebra (none of which affect its underlying shape).
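Adjusting a set of marked points for rotation, scaling, and translation can be sketched as a least-squares similarity alignment onto a reference shape. The complex-number formulation and the function name below are illustrative choices, not necessarily the procedure the authors used.

```python
def align_similarity(points, reference):
    """Least-squares scale, rotation, and translation of `points` onto `reference`.

    Both arguments are equal-length lists of (x, y) tuples with corresponding
    points in the same order. Returns the aligned copy of `points`.
    """
    z = [complex(x, y) for x, y in points]
    w = [complex(x, y) for x, y in reference]
    z_mean = sum(z) / len(z)
    w_mean = sum(w) / len(w)
    zc = [v - z_mean for v in z]
    wc = [v - w_mean for v in w]
    norm = sum(abs(v) ** 2 for v in zc)
    if norm == 0.0:
        raise ValueError("points must not all coincide")
    # One complex factor encodes scale (modulus) and rotation (argument).
    a = sum(v.conjugate() * u for v, u in zip(zc, wc)) / norm
    aligned = [a * v + w_mean for v in zc]
    return [(p.real, p.imag) for p in aligned]

# Toy check: the reference rotated by 90 degrees, doubled in size, and shifted.
reference = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
moved = [(5.0, -3.0), (5.0, 1.0), (3.0, 1.0), (3.0, -3.0)]
aligned = align_similarity(moved, reference)
```

After alignment, any remaining differences between point sets reflect shape alone, which is what the classifier is meant to see.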

The third vertebral shape description was x, the full vertebral contour, which described the shape of the whole vertebral body rather than only six points on the vertebra. Although acquisition of this shape description would be practical only with an ASM search, since it is too time-consuming to routinely annotate manually, the manually annotated full contour was used for these experiments. The fracture-detection performance with the manually annotated full contour would reflect the potential performance of ASM-based fracture detection.

A form of statistical classifier known as a Mahalanobis distance classifier (12) was trained with the vertebral shape descriptions and "known" fracture status of normal and fractured L1–L4 vertebrae obtained by means of visual inspection of each DXA scan by an experienced radiologist (J.E.A.). The classifier modeled the shape distributions of normal and fractured vertebrae during training as multivariate Gaussian distributions. The classifier could then be used to classify new vertebrae as either normal or fractured on the basis of the shape description, depending on the proximity of the new shape description to the distributions of shapes known (from training) to be fractured or normal. The Mahalanobis distance D, a measure of this proximity, is related to the log of the probability of a vector v = {v1, v2, . . ., vt} (in this case representing vertebral shape) under a known Gaussian distribution (which describes the range of normal or fractured vertebral shapes). The mean is $\mu = \{\mu_1, \mu_2, \ldots, \mu_t\}$ and the variance is $\lambda_i$ in the ith parameter. The squared distance is given by the following equation:

$$D^2 = \sum_{i=1}^{t} \frac{(v_i - \mu_i)^2}{\lambda_i}$$

To decide whether a vertebra was fractured or normal, its shape description was compared with the means of the fractured and normal groups. The vertebra was then assigned to the group that was nearer in terms of Mahalanobis distance. Classification with the Mahalanobis distance was an attempt to mimic the judgment of fracture status by the radiologist.
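The nearest-group rule can be sketched as follows, assuming one variance per shape parameter (a diagonal covariance), as in the per-parameter variances described for the distance. The group means and variances shown are invented for illustration and are not values from the study.

```python
def mahalanobis_sq(v, mean, variances):
    """Squared Mahalanobis distance for independent (diagonal) variances."""
    return sum((vi - mi) ** 2 / li for vi, mi, li in zip(v, mean, variances))

def classify(v, normal, fractured):
    """Assign `v` to the nearer group; each group is a (mean, variances) pair."""
    d_normal = mahalanobis_sq(v, *normal)
    d_fractured = mahalanobis_sq(v, *fractured)
    return "fractured" if d_fractured < d_normal else "normal"

# Illustrative numbers only: anterior, middle, posterior heights in mm.
normal_model = ([25.0, 24.0, 26.0], [4.0, 4.0, 4.0])
fractured_model = ([18.0, 19.0, 24.0], [9.0, 9.0, 9.0])
label = classify([19.0, 20.0, 25.0], normal_model, fractured_model)
```

The same two functions apply unchanged to any of the three shape descriptions, because each is just a vector of parameters with a mean and a variance per parameter.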

The specificity of the classifier in not mistaking normal vertebrae as fractured was measured for a range of sensitivities to fracture. When the classifier was very sensitive, all fractures were detected, but many normal vertebrae were wrongly classified as fractured. When the classifier was made less sensitive, no normal vertebrae were misclassified as fractured, but many fractured vertebrae were not detected. A receiver operating characteristic (ROC) curve described the performance of the classifier in distinguishing fractured from normal vertebrae by plotting sensitivity against the quantity (1 - specificity). The area under the ROC curve (1 for a perfect classifier, 0.5 for an ineffective classifier) was also calculated as an indicator of fracture classification performance.
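The construction of the ROC curve and its area can be sketched as below. The score is assumed to increase with the likelihood of fracture (for example, the distance to the normal group minus the distance to the fractured group); the scores shown are invented for illustration.

```python
def roc_curve(scores, is_fractured):
    """Return (1 - specificity, sensitivity) pairs for every score threshold.

    Higher scores are taken to mean "more likely fractured".
    """
    n_pos = sum(is_fractured)
    n_neg = len(is_fractured) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one fractured and one normal example")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, f in zip(scores, is_fractured) if f and s >= threshold)
        fp = sum(1 for s, f in zip(scores, is_fractured) if not f and s >= threshold)
        points.append((fp / n_neg, tp / n_pos))
    return points

def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Illustrative scores only (not data from the study).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
truth = [True, True, False, True, False, False]
curve = roc_curve(scores, truth)
auc = area_under_curve(curve)
```

Lowering the threshold moves along the curve from the strict end (few false alarms, missed fractures) to the sensitive end (all fractures found, many false alarms), which is the trade-off described in the text.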

The performance of the classifier with each of the three shape descriptions was tested on a leave-n-out basis. The classifier was trained with all training examples except n and was tested on the n excluded examples. For these experiments, n was chosen as 4. This train-test process was repeated, with each group of n examples left out of the training set in turn, ensuring that every example was treated as unseen by the classifier. The training and test data consisted of 128 L1–L4 vertebrae from scans obtained in patients with vertebral fractures and a matching set of 128 vertebrae from scans obtained in subjects with normal vertebrae. Although the L1–L4 lumbar vertebrae differ slightly in shape, they were treated as equivalent for these experiments to maximize the training data available for the classifier. The variance in the measurement of the area under the ROC curve was also estimated by performing the above leave-n-out experiment on many subsets of the training data with n further examples left out.
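The leave-n-out procedure can be sketched as a generator of train and test index sets. How the groups of n were formed is not stated in the text, so grouping consecutive examples is an assumption here, as is the function name.

```python
def leave_n_out_splits(n_examples, n):
    """Yield (train_indices, test_indices), leaving out n consecutive examples."""
    if n < 1 or n > n_examples:
        raise ValueError("n must be between 1 and the number of examples")
    indices = list(range(n_examples))
    for start in range(0, n_examples, n):
        test = indices[start:start + n]
        train = indices[:start] + indices[start + n:]
        yield train, test

# 128 fractured plus 128 normal vertebrae, left out four at a time.
splits = list(leave_n_out_splits(256, 4))
```

Setting n to 1 gives the leave-one-out scheme, in which every example is tested by a model that never saw it during training.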


    Results
Table 1 shows that for normal vertebrae, a vertebral search with the automatic ASM was as accurate as that by manual operators. For the T7 and L4 vertebrae, an ASM search was slightly worse because these vertebrae, at the ends of the shape model, had an adjacent vertebra on only one side to help constrain their likely location. Precision was good, implying that the search solution was independent of the manual placement of search initialization points. The failure rate was low. The performance of the automatic method suggests that it could provide a practical and improved alternative to manual methods for performing vertebral morphometry.


TABLE 1. Manual and ASM Accuracy for Each Vertebral Level in Subjects with Normal Vertebrae
 
The errors in location of fractured lumbar vertebrae with the ASM are shown in Table 2. These errors were approximately twice the magnitude of those for the normal lumbar vertebrae, and the search failure rate was higher.


TABLE 2. ASM Accuracy for Each Vertebral Level in Patients with Vertebral Fractures
 
ROC curves that describe the ability of a Mahalanobis distance classifier to distinguish fractured from normal vertebrae are shown in Figure 7. Table 3 lists the mean area under the ROC curve with each method. Comparisons between the areas under ROC curves were made with the method of Beck and Shultz (13). The full vertebral shape description was found to be marginally more effective than was the vertebral height description (P = .03) for distinguishing fractured from normal vertebrae. Use of six points to describe vertebral shape was not significantly worse than use of the full shape (P = .56) and was not significantly better than use of vertebral heights (P = .13). Given the small sample size and the small proportion of misclassified vertebrae for all methods, the low statistical significance for differences between methods is not surprising.



Figure 7. ROC curves represent the distinguishability of normal from fractured vertebrae with three descriptions of vertebral shape.

 

TABLE 3. Area under the ROC Curve for Detection of Vertebral Fractures with Three Methods
 

    Discussion
The development of quantitative methods for vertebral morphometry has been motivated by the need for rapid, reproducible, and accurate measurement of vertebral shape to reliably detect vertebral fractures (14). Most quantitative methods involve measurement of anterior, middle, and posterior vertebral heights as derived with six (or more) points marked manually on the vertebral edges. These points can be marked on a radiograph, in which case vertebral heights must be measured with a ruler. Alternatively, the point positions can be entered with a digitizing tablet (15), or the radiograph itself can be digitized into a computer image and the points can be marked manually onto the image with a mouse (16). The computer then automatically calculates the vertebral heights on the basis of the point positions. Such methods are faster and more reliable than manual measurement of vertebral heights, although manual marking of the points is itself a subjective process that is open to different interpretations.

ASMs are a further advance toward the automation of vertebral morphometry. We found that ASMs are capable of robust automatic location and measurement of the shape of normal vertebrae, and, with virtually no user interaction, results are as accurate as those with human operators. We performed initial evaluation of the method with DXA images obtained of osteoporotic vertebral fractures. Although the performance of the technique in this group was not as good as that in the normal group, we believe this is a result of the availability of an insufficient number of vertebral fractures to train the model well (17,18). The accuracy for automatically locating normal vertebrae is similar to the reproducibility obtained by Steiger et al (17) for manual analysis of morphometric scans. They found the measurement of anterior and posterior vertebral heights to be precise to approximately 1 mm.

The ASM technique is governed by the range of vertebral shapes and appearances contained within the set of images used to train the model. Any vertebra with a shape similar to those in the training set should be easily located with the ASM. The performance will be noticeably reduced only if a particular vertebral shape lies well outside the range of those observed in training (for example, if the training set contained only normal spines and a search is performed to locate a vertebra with a severe wedge fracture). This means that an ASM trained with images obtained in patients of one race and used to locate vertebrae in patients of another race should perform well, as the variations between races will be small in comparison to the full range of variability in vertebral shape.
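The dependence on the training set can be made explicit with the usual point distribution model notation. The expression below follows the standard formulation of Cootes et al (6) and is given as an assumed sketch, not as a quotation of the model settings used in this study; in particular, the limit of three standard deviations is a common convention and is an assumption here.

```latex
\mathbf{x} \approx \bar{\mathbf{x}} + \mathbf{P}\,\mathbf{b},
\qquad
|b_i| \le 3\sqrt{\lambda_i}, \quad i = 1, \dots, t
```

Here x is the vector of landmark coordinates, x̄ is the mean shape of the training set, the columns of P are the first t eigenvectors of the training covariance matrix, the λ_i are the corresponding eigenvalues, and b is the vector of shape parameters. A shape that would need some b_i far outside these limits, such as a severe wedge fracture sought with a model trained only on normal spines, cannot be represented well, which is why search performance degrades in that case.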

The analysis of points marked on the vertebral edges has been the subject of much study (11,14,15,19–23). Different approaches vary greatly in complexity, effectiveness, and the way in which each has been tested. When the fracture detection performance of two methods is compared, it is necessary to compare their specificities over a range of sensitivities rather than at a single operating point.

In most studies, only one pair of sensitivity and specificity values is reported, which makes comparison between methods difficult. This reflects a failure to appreciate the fact that fracture is a continuous phenomenon, and that the best threshold for defining fractures depends on many factors, such as the consequences of vertebral fractures, the cost of treatment, and the reliability of measurement.
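The point about operating ranges can be illustrated with a short Python sketch. It is an illustration only and is not the evaluation code of any cited study: the deformity scores and fracture labels are invented, and the rule that a vertebra is called fractured when its score reaches the threshold is an assumption.

```python
def roc_points(scores, is_fracture):
    """Return (threshold, sensitivity, specificity) for each candidate threshold.

    A vertebra is called fractured when its score is >= threshold.
    Sensitivity is the proportion of true fractures detected; specificity
    is the proportion of nonfractures correctly left unflagged.
    """
    n_pos = sum(is_fracture)
    n_neg = len(is_fracture) - n_pos
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, f in zip(scores, is_fracture) if f and s >= t)
        tn = sum(1 for s, f in zip(scores, is_fracture) if not f and s < t)
        points.append((t, tp / n_pos, tn / n_neg))
    return points


# Invented deformity scores and reference gradings (not study data).
scores = [0.05, 0.10, 0.12, 0.20, 0.25, 0.40]
labels = [False, False, True, False, True, True]
curve = roc_points(scores, labels)
```

Each threshold gives a different sensitivity and specificity pair, so a single reported pair describes only one point on the curve and two methods reported at different points cannot be ranked directly.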

Smith-Bindman et al (11) used the radiologist-agreed semiquantitative grades of Genant et al (4) as "ground truth" in the comparison of classifications of fractures based on six points. They compared the effectiveness of placing thresholds on anterior, middle, and posterior vertebral heights with that of more complex measures based on combinations of vertebral heights. Methods that first normalize for variations in mean vertebral dimension (some individuals have larger vertebrae than others) and those that compare heights with those of a neighboring vertebra were found to be more effective than were methods based on simple thresholds of vertebral height. Typical values for sensitivity (proportion of fractures successfully detected) and specificity (proportion of nonfractures marked as nonfractures) for the best methods employing such normalization are 90% and 87%, respectively. Other methods that attempt to normalize for variations between individuals and between vertebrae, such as those of Black et al (24), Ross et al (21), and McCloskey et al (15), have generally been found to be superior to simpler techniques, such as that of Melton et al (19), that place simple thresholds on vertebral heights.
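A minimal Python sketch of why normalized measures are less sensitive to overall vertebral size follows. It does not reproduce the definitions of Smith-Bindman et al or of any other cited method: the three ratios, their names, and the cutoff of 0.8 are hypothetical choices made for illustration, and the heights are invented.

```python
def normalized_measures(heights, neighbour_heights):
    """Return size-independent height ratios for one vertebra.

    `heights` and `neighbour_heights` hold anterior, middle, and posterior
    heights in mm. Ratios within a vertebra cancel overall size; the last
    ratio compares the vertebra with an adjacent one.
    """
    ha, hm, hp = heights["anterior"], heights["middle"], heights["posterior"]
    return {
        "wedge": ha / hp,
        "biconcave": hm / hp,
        "crush": hp / neighbour_heights["posterior"],
    }


def flag_deformity(measures, cutoff=0.8):
    """Flag the vertebra if any ratio falls below the (hypothetical) cutoff."""
    return any(value < cutoff for value in measures.values())


# Invented heights: one wedged vertebra and one normal vertebra,
# each compared with the same normal neighbour.
neighbour = {"anterior": 21.0, "middle": 20.0, "posterior": 22.0}
wedged = {"anterior": 16.0, "middle": 19.0, "posterior": 22.0}
normal = {"anterior": 21.0, "middle": 20.0, "posterior": 22.0}

wedged_flag = flag_deformity(normalized_measures(wedged, neighbour))
normal_flag = flag_deformity(normalized_measures(normal, neighbour))
```

Because every measure is a ratio, scaling all heights by the same factor leaves the result unchanged, whereas a fixed threshold on absolute height would treat a small normal vertebra and a large fractured one alike.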

Such comparisons of fracture detection algorithms can be performed either by using semiquantitative gradings with a method such as that of Genant et al (4) as "ground truth" or by performing serial radiography and accepting as "true" only fractures that are confirmed later. Although serial radiographic findings are less biased, their acquisition may not always be practical, in which case use of a radiologist's grading as "ground truth" is a reasonable alternative. In our experiments, the radiologist's grading was treated as truth.

Recently, artificial neural network (ANN) classifiers have been used to relate vertebral heights from six measured points to the "known" degree of fracture as determined with a semiquantitative technique (23). Results with the ANN method were compared to those with the method of Black et al (24), and the ANN was found to possibly improve sensitivity to fractures, although the Black method was better at lower sensitivities. Typical values for sensitivity and specificity, respectively, are 87% and 88% for the ANN classifiers and 80% and 92% for the Black method.

We found that a more complete description of vertebral shape, as is obtained with an ASM, may contain much more information for detecting vertebral fractures than is possible with current manual analysis. By measuring vertebral heights with six-point manual analysis, valuable information is discarded about the overall vertebral shape, which may result in poorer fracture detection. If the positions of all six points are used to classify fractures, then performance may be improved. Most current methods for detecting fractures are based on analysis of vertebral heights; these methods contain less information than is available with the manually annotated points.

To our knowledge, few studies have been performed to evaluate the power of DXA morphometry compared with that of conventional radiography (17,18). Since conventional radiography is still preferred in studies of vertebral morphometry, it would be useful if our approach could be extended to work with digitized radiographs. With conventional radiography, the whole spine cannot be depicted on a single image. It is necessary to "patch together" overlapping images to depict the whole spine. The technique we describe could be extended to locate and measure vertebrae on each radiograph separately, and then the results for each radiograph could be combined to analyze the whole spine.

Because conventional radiography gives rise to projection effects, the vertebral views will not remain consistently lateral and will vary through the radiograph. This behavior is independent of inherent vertebral shape and introduces additional variability in vertebral appearance, which could lead to a requirement for more training examples to adequately model this additional variability. In a feasibility study for this work, ASMs were found to be effective in the analysis of digitized radiographs of the lumbar spine (25).

The need for an automatic system for morphometry in large studies led to the development of a method that was used in the European Vertebral Osteoporosis Study (26). In this method, very simple edge detection algorithms were used that were easily distracted by lung tissue and any degradation in image quality, and imperfectly positioned lateral vertebral views could not be used. The reproducibility was worse than that with the manual placement technique, and only 60% of radiographs could be processed with minimal interaction.

We applied our automatic method to DXA images, which are poorer in quality and resolution than radiographs. Accuracy was as good as that with manual methods, and the failure rate was lower than that with the method used in the European Vertebral Osteoporosis Study. Although our technique does not yet represent a statistically significant improvement in performance over that with manual methods, it can be performed more easily and rapidly and is very precise. These practical advantages could make it a valuable tool for use in other large multicenter vertebral osteoporosis studies.

We are currently extending the ASM technique for use with full-spine DXA scans obtained in a large cohort of patients with vertebral fractures. These scans are acquired on a DXA scanner with an upgraded fan beam, which produces images with higher resolution and less noise and acquires single-energy scans. We intend to compare the performance of our technique on both dual- and single-energy images and to continue our comparison of manual and automatic techniques in this larger cohort.

When vertebral shape was represented as a full contour rather than as only vertebral heights, as in current morphometric methods, the ability of a statistical classifier to distinguish fractured from normal vertebrae was improved. This illustrates the potential for an ASM to be used to detect fractures. When vertebral shape was represented by means of the positions of six points on the outline, results were also slightly better than those based on vertebral heights.

The spine ASM, a robust tool for measuring vertebral shape on normal spine DXA scans, automatically extracts more shape information both accurately and more rapidly than is practical with manual analysis.


    Acknowledgments
 
The authors thank Ian Smith, MD, of Synexus, Chorley, England, for the normal DXA scans, and Peter Steiger, PhD, of Hologic for the DXA scans of patients with osteoporosis. The authors also thank Steve Capener, Karen Davies, MSc, and Yvon Watson for manually annotating the DXA scans.


    Footnotes
 
Abbreviations: ANN = artificial neural network, ASM = active shape model, DXA = dual x-ray absorptiometry, PDM = point distribution model, ROC = receiver operating characteristic

Author contributions: Guarantors of integrity of entire study, all authors; study concepts, P.P.S., J.E.A.; study design, all authors; definition of intellectual content, all authors; literature research, P.P.S.; clinical studies, J.E.A.; experimental studies, all authors; data acquisition, P.P.S.; statistical analysis, P.P.S., C.J.T.; manuscript preparation, editing, and review, all authors.


    References
 

  1. Cummings SR, Kelsey JL, Nevitt MC, O'Dowd KJ. Epidemiology of osteoporosis and osteoporotic fractures. Epidemiol Rev 1985; 7:178-208.
  2. Holbrook T, Grazier K, Kelsey J, Stauffer R. The frequency of occurrence, impact and the cost of musculo-skeletal conditions in the United States. Chicago, Ill: American Academy of Orthopedic Surgeons, 1985.
  3. Riggs BL, Melton LJ, eds. Osteoporosis: etiology, diagnosis and management. New York, NY: Raven, 1988.
  4. Genant H, Wu C, Kuijk CC, Nevitt M. Vertebral fracture assessment using a semiquantitative technique. J Bone Miner Res 1993; 8:1137-1148.
  5. Cootes TF, Hill A, Taylor CJ, Haslam J. The use of active shape models for locating structures in medical images. Image Vision Comput 1994; 6:276-285.
  6. Cootes TF, Taylor CJ, Cooper DH, Graham J. Active shape models: their training and application. Comput Vision Image Understanding 1995; 1:38-59.
  7. Hill A, Thornham A, Taylor CJ. Model-based interpretation of 3D medical images. In: Illingworth J, eds. Fourth British Machine Vision Conference. Malvern, England: British Machine Vision Association, 1993; 339-348.
  8. Manly B. Multivariate statistical methods: a primer. London, England: Chapman & Hall, 1986.
  9. Cootes TF, Taylor C, Lanitis A. Active shape models: evaluation of a multi-resolution method for improving image search. In: Hancock E, eds. Fifth British Machine Vision Conference. Malvern, England: British Machine Vision Association, 1994; 327-336.
  10. Efron B. The jackknife, the bootstrap and other resampling plans. Philadelphia, Pa: The Society for Industrial and Applied Mathematics, 1982.
  11. Smith-Bindman R, Steiger P, Cummings S, Genant H. A comparison of morphometric definitions of vertebral fracture. J Bone Miner Res 1991; 6:25-34.
  12. Ripley B. Pattern recognition and neural networks. Cambridge, Mass: Cambridge University Press, 1996.
  13. Beck JR, Shultz EK. The use of relative operating characteristic (ROC) curves in test performance evaluation. Arch Pathol Lab Med 1986; 110:13-20.
  14. Minne H, Leidig G, Wüster C, et al. A newly developed spine deformity index (SDI) to quantitate vertebral crush fractures in patients with osteoporosis. Bone Miner 1988; 3:335-349.
  15. McCloskey E, Spector T, Eyres K, et al. The assessment of vertebral deformity: a method for use in population studies and clinical trials. Osteoporos Int 1993; 3:138-147.
  16. Redei J, Countryman P, Genant H. Computer-assisted morphometry of vertebral fractures. In: Genant H, Jergas M, Kuijk CV, eds. Vertebral fracture in osteoporosis. Berkeley, Calif: University of California Press, 1995; 293-308.
  17. Steiger P, Cummings S, Genant H, Weiss H, Study of Osteoporotic Fracture Research Group. Morphometric x-ray absorptiometry of the spine: correlation in vivo with morphometric radiography. Osteoporos Int 1994; 4:238-244.
  18. Blake GM, Rea JA, Fogelman I. Vertebral morphometry studies using dual-energy x-ray absorptiometry. Semin Nucl Med 1997; 27:276-290.
  19. Melton L, Kan S, Frye M, Wahner H, O'Fallon W, Riggs B. Epidemiology of vertebral fractures in women. Am J Epidemiol 1989; 129:1000-1011.
  20. Eastell R, Cedel S, Wahner H, Riggs B, Melton L. Classification of vertebral fractures. J Bone Miner Res 1991; 6:207-215.
  21. Ross P, Yhee Y, Davis J, Kaminoto C, Epstein R, Wasnich R. A new method for vertebral fracture diagnosis. J Bone Miner Res 1993; 8:167-174.
  22. Davies K, Recker R, Heaney R. Revisable criteria for vertebral deformity. Osteoporos Int 1993; 3:265-270.
  23. Redei J, Countryman P, Genant H. Development of a neural network–based computer aided diagnosis system for a vertebral morphometry workstation (abstr). Osteoporos Int 1996; 19(3S):159.
  24. Black DM, Cummings SR, Stone K, Hudes E, Palermo L, Steiger P. A new approach to defining normal vertebral dimensions. J Bone Miner Res 1991; 6:883-892.
  25. Lindley K. Model-based interpretation of lumbar radiographs. Thesis. England: University of Manchester, 1992.
  26. Felsenberg D, Kalender W. Computer-assisted morphometry of vertebral fractures. In: Genant H, Jergas M, Kuijk CV, eds. Vertebral fracture in osteoporosis. Berkeley, Calif: University of California Press, 1995; 309-317.




