Introduction

Osteoporosis is a chronic skeletal disease characterized by reduced bone mass and deteriorated bone microarchitecture, which results in an increased risk of fractures. It is estimated that about 50% of White women and 20% of men will experience an osteoporosis-related fracture at some point in their lifetime.1 Osteoporotic fractures may lead to disability, impaired quality of life, and increased mortality, and they are a tremendous burden, both personal and economic.2-4 The total health burden, measured in terms of quality-adjusted life year (QALY) loss, was estimated at 1 million lost QALYs in 2017 for the 5 largest countries of the European Union and Sweden (EU6).3 Disability-adjusted life years caused by osteoporotic fractures were exceeded only by those caused by lung cancer, dementia, and ischemic heart disease. The 2.7 million fractures that caused this health burden also accounted for direct costs of €37.5 billion. In Poland, about 120 000 patients experience an osteoporotic fracture annually, with direct costs alone amounting to PLN 473 million (approximately €104 million).5 The annual mortality rate after fracture of the proximal femur in Poland was 29.4% in 2017.5 The number of years of life lost that can be directly attributed to these fractures is about 20 000.

Unfortunately, osteoporosis is an underdiagnosed disease.3,5,6 In the EU6 countries, 73% of women and 63% of men eligible for osteoporosis treatment do not receive it.3 In Poland, it is estimated that 74% of patients with osteoporosis remain undiagnosed—of 2.1 million Poles with this disease only 0.55 million are diagnosed.5 The assessment of bone mineral density (BMD) alone is insufficient for identification of patients at risk of fracture (novel techniques are used to also assess bone microarchitecture,7 and the key element is the evaluation of the individual fracture risk with questionnaires, such as country-specific Fracture Risk Assessment Tool [FRAX]); however, dual-energy X-ray absorptiometry (DXA) of the proximal femur and lumbar spine remains the gold standard technique for osteoporosis diagnosis per definition.3,4,6,8 To improve the recognition of osteoporosis, better access to DXA is needed.4,6 Still, accessibility to DXA examinations in Poland is highly limited. The availability of DXA units in Poland is one of the worst in Europe (fifth from the bottom out of 27 European countries in 2010).4 In 2017, this examination was performed only in 176 000 patients.5

For these reasons, the search for novel diagnostic techniques continues.6 Ultrasound methods may be of particular interest because they involve nonionizing techniques, and the devices are portable and relatively cheap9 (although the price of an ultrasound device is similar to that of a DXA device, it is characterized by lower utilization costs). Techniques of bone density assessment by quantitative ultrasound (QUS) have been studied for many years and a meta-analysis of these studies showed the value of QUS in predicting fractures.10 Still, QUS has not yet gained widespread clinical use and its role in diagnosing osteoporosis remains undetermined. An equally important problem is the fact that DXA requires technical and interpretive excellence, scrupulous adherence to equipment calibration protocols, and the assessment of measurement precision.4,6,11 Failure to adhere to strict DXA measurement principles can lead to a high percentage of erroneous BMD reports in everyday clinical practice.12,13 Radiofrequency echographic multispectrometry (REMS), an innovative quantitative ultrasound technique for the diagnosis of osteoporosis in the proximal femur and lumbar spine, has been recently validated in a few clinical studies, showing promising results.14-16 It was approved by the Food and Drug Administration in October 2018 and was declared a valuable tool for osteoporosis diagnosis and fracture risk prediction by the European Society for Clinical and Economic Aspects of Osteoporosis, Osteoarthritis and Musculoskeletal Diseases (ESCEO).17

The aim of our study was to evaluate the diagnostic agreement between REMS and DXA for the assessment of bone density in the proximal femur and lumbar spine in a Polish group of patients.

Patients and methods

The study was conducted at the Department of Radiology in the National Institute of Geriatrics, Rheumatology, and Rehabilitation in Warsaw, Poland. The inclusion criteria were White ethnicity, age between 40 and 87 years, and indications for femoral and lumbar spine DXA (age >65 years in women and >70 years in men or postmenopausal age in women and >50 years in men with risk factors for fracture, history of low-trauma fracture of the spine or hip in individuals aged >50 years, and risk of secondary osteoporosis regardless of age). The exclusion criterion was a significant skeletal impairment not allowing proper execution of DXA and/or REMS. All recruited patients underwent DXA and REMS of the proximal femur and lumbar spine. The study protocol was approved by the hospital bioethics committee (no. KBT-7/1/2017). All participants (n = 116; 98 women and 18 men) signed informed consent for inclusion in the analysis. The study was conducted according to the Declaration of Helsinki.

Both DXA and REMS reports included the BMD value, expressed as grams per square centimeter (g/cm2), and T-score values. According to the World Health Organization definitions,18 osteoporosis was diagnosed in patients with a T-score not exceeding –2.5 SD, osteopenia in those with a T-score greater than –2.5 and lower than –1.0 SD, whereas individuals with a T-score of –1.0 SD or greater were considered healthy. All DXA and REMS reports were reviewed for possible errors by 2 independent experts (according to the methodology described by Di Paola et al),14 and only error-free reports were included in the statistical analysis. Errors were categorized as those related to data analysis, patient positioning, artifacts, or incorrect personal data entry.
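The diagnostic thresholds above can be written as a short function. This is a minimal sketch of the classification rule as stated in the text; the function name `classify_t_score` and the example values are illustrative assumptions, not part of the study software.

```python
def classify_t_score(t_score: float) -> str:
    """Map a T-score (in SD units) to the diagnostic category used in the text."""
    if t_score <= -2.5:
        # T-score not exceeding -2.5 SD
        return "osteoporosis"
    if t_score < -1.0:
        # T-score greater than -2.5 SD and lower than -1.0 SD
        return "osteopenia"
    # T-score of -1.0 SD or greater
    return "healthy"


# Hypothetical T-scores, chosen to include both boundary values.
for t in (-3.1, -2.5, -1.6, -1.0, 0.4):
    print(t, classify_t_score(t))
```

Both boundaries are inclusive on the side named in the text: a T-score of exactly –2.5 is classified as osteoporosis, and a T-score of exactly –1.0 as healthy.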

The DXA scans were performed using a Discovery A densitometer (Hologic, Marlborough, Massachusetts, United States), whereas the REMS scans were carried out using EchoStation with the dedicated EchoStudio software (Echolight Spa, Lecce, Italy). The latter device is equipped with a 3.5-MHz broadband convex ultrasound transducer and configured to provide both echographic images and “raw,” unfiltered radiofrequency signals, sampled at 40 MS/s. During examination of the proximal femur, the probe is placed in the hip area, and when the lumbar spine is examined, it is placed on the abdominal wall (initially on the xiphoid process of the sternum) and moved centrally down, with automated identification of the regions of interest. Examination of the spine lasts approximately 80 s and that of the proximal femur about 40 s; the progress of the test is displayed on the screen and signaled by sounds. All scans are performed in a supine position. The acquired ultrasonographic data are processed by an automatic algorithm that performs a series of spectral and statistical analyses. The analysis of both the echographic images (Figure 1A) and native radiofrequency signals allows for the calculation of BMD (Figure 1B). T-score and Z-score for REMS results are calculated using the National Health and Nutrition Examination Survey reference database19 (as in the case of Hologic DXA devices).

Figure 1. (A) Ultrasound image obtained during radiofrequency echographic multispectrometry examination of the lumbar spine

Statistical analysis

The diagnostic agreement between the 2 methods was assessed by calculating the diagnostic agreement percent and Cohen κ for the DXA and REMS results, with separate analyses for the proximal femur and lumbar spine scans. Additionally, a DXA-REMS correlation coefficient and Bland–Altman plots (of differences between DXA and REMS measurements plotted against the averages of the 2 measurements) were obtained as supportive methods for comparing the results of REMS and DXA. The normality of data distribution was assessed using the Shapiro–Wilk test. The correlation was determined using Spearman rank correlation coefficient due to the nonparametric nature of the variables. Additional sub-analyses of the impact of sex, age, and BMI were performed—the significance of the observed differences between correlations in different groups was measured using Fisher Z-transformation. The level of significance was set at P value of less than 0.05. Statistical analysis was performed using Statistica software, version 13.1 (StatSoft, Tulsa, Oklahoma, United States).
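As an illustration of the two agreement measures named above, the following sketch computes the diagnostic agreement percent and an unweighted Cohen κ from paired categorical diagnoses. The study used Statistica; this code is not the authors' implementation, the text does not state whether a weighted κ was used, and the function name and the toy data are hypothetical.

```python
from collections import Counter


def agreement_and_kappa(dxa, rems):
    """Return (percent agreement, unweighted Cohen kappa) for paired diagnoses."""
    if len(dxa) != len(rems) or not dxa:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(dxa)
    # Observed agreement: share of patients given the same category by both methods.
    observed = sum(a == b for a, b in zip(dxa, rems)) / n
    # Chance agreement: product of the marginal proportions, summed over categories.
    dxa_counts, rems_counts = Counter(dxa), Counter(rems)
    categories = set(dxa) | set(rems)
    expected = sum(dxa_counts[c] * rems_counts[c] for c in categories) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category occurs")
    kappa = (observed - expected) / (1.0 - expected)
    return 100.0 * observed, kappa


# Hypothetical paired diagnoses for 6 patients (not study data).
dxa = ["healthy", "healthy", "osteopenia", "osteopenia", "osteoporosis", "osteopenia"]
rems = ["healthy", "healthy", "osteopenia", "healthy", "osteoporosis", "osteopenia"]
percent, kappa = agreement_and_kappa(dxa, rems)
print(f"agreement = {percent:.1f}%, kappa = {kappa:.3f}")
```

Cohen κ corrects the raw agreement for the agreement expected by chance, which is why it is reported alongside the percentage rather than instead of it.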

Results

After the exclusion of patients due to significant skeletal impairments, missing results, and erroneous reports, 66 reports of the proximal femur and 58 reports of the lumbar spine were included in the final analysis (Figure 2). Only 23.4% of all erroneous reports in the femur group and 20% in the lumbar spine group were excluded due to REMS errors. The remaining reports were excluded due to DXA errors, primarily incorrect data analysis.

Figure 2. Flow chart for patient enrolment and data validation

Abbreviations: DXA, dual-energy X-ray absorptiometry; L-S, lumbosacral; others, see Figure 1

Characteristics of patients included in the statistical analysis are shown in Table 1 (lumbar spine group) and Table 2 (femur group). The diagnostic agreement between DXA and REMS results (patients diagnosed as healthy, osteopenic, or osteoporotic) was 82.8% (Cohen κ = 0.611) in the lumbar spine group and 84.8% (Cohen κ = 0.667) in the femur group. Strong correlations between REMS and DXA results (BMD and T-scores) were found, both in the lumbar spine and femur groups (r and P values are presented in Tables 1 and 2 and in Supplementary material, Figure S1A–S1D). Scatter diagrams of the differences between DXA and REMS measurements plotted against the averages of the 2 measurements are presented in Bland–Altman plots, Figure 3A and 3B (lumbar spine group) and Figure 3C and 3D (femur group).
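The quantities shown in a Bland–Altman plot can be computed as in the sketch below: the per-patient difference (DXA minus REMS), the per-patient average, the mean difference (bias), and the limits of agreement. The limits of bias ± 1.96 SD are the conventional choice and assume roughly normally distributed differences; the text does not state how the limits in Figure 3 were derived, and the BMD values here are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(dxa, rems):
    """Return (averages, differences, bias, (lower limit, upper limit))."""
    if len(dxa) != len(rems) or len(dxa) < 2:
        raise ValueError("need two sequences of equal length with at least 2 pairs")
    differences = [d - r for d, r in zip(dxa, rems)]
    averages = [(d + r) / 2 for d, r in zip(dxa, rems)]
    bias = mean(differences)
    sd = stdev(differences)  # sample SD of the differences
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return averages, differences, bias, limits


# Hypothetical paired BMD values in g/cm2 (not study data).
dxa_bmd = [0.873, 0.790, 1.020, 0.655, 0.910]
rems_bmd = [0.914, 0.801, 0.965, 0.690, 0.902]
averages, differences, bias, limits = bland_altman(dxa_bmd, rems_bmd)
print(f"bias = {bias:.3f}, limits of agreement = {limits[0]:.3f} to {limits[1]:.3f}")
```

Plotting `differences` against `averages` with horizontal lines at `bias` and at both limits gives the scatter diagram described above.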

Table 1. Characteristics of the lumbar spine group with error-free scans

| Variable | Method | Value |
|---|---|---|
| Female sex, n (%) | | 53 (91.4) |
| Age, y, median (min–max) | | 61 (40–87) |
| BMI, kg/m2, median (min–max) | | 25.95 (18.2–42.6) |
| Diagnosis: osteoporosis, n (%) | DXA | 7 (12.1) |
| | REMS | 4 (6.9) |
| Diagnosis: osteopenia, n (%) | DXA | 33 (56.9) |
| | REMS | 30 (51.7) |
| Diagnosis: healthy, n (%) | DXA | 18 (31) |
| | REMS | 24 (41.4) |
| Diagnostic agreement, % | | 82.8 |
| Cohen κ | | 0.611 |
| BMD, g/cm2, median (min–max) | DXA | 0.873 (0.642–1.300) |
| | REMS | 0.914 (0.673–1.107) |
| | Spearman correlation | r = 0.839; P <0.001 |
| T-score, median (min–max) | DXA | –1.6 (–3.7 to 2.3) |
| | REMS | –1.2 (–3.4 to 0.2) |
| | Spearman correlation | r = 0.846; P <0.001 |

Abbreviations: BMI, body mass index; others, see Figures 1 and 2

Table 2. Characteristics of the femur group with error-free scans

| Variable | Method | Value |
|---|---|---|
| Female sex, n (%) | | 53 (80.3) |
| Age, y, median (min–max) | | 62 (40–85) |
| BMI, kg/m2, median (min–max) | | 26.65 (19.4–36.6) |
| Diagnosis: osteoporosis, n (%) | DXA | 3 (4.6) |
| | REMS | 4 (6.1) |
| Diagnosis: osteopenia, n (%) | DXA | 32 (48.5) |
| | REMS | 35 (53) |
| Diagnosis: healthy, n (%) | DXA | 31 (46.9) |
| | REMS | 27 (40.9) |
| Diagnostic agreement, % | | 84.8 |
| Cohen κ | | 0.667 |
| BMD, g/cm2, median (min–max) | DXA | 0.748 (0.455–1.151) |
| | REMS | 0.720 (0.500–1.053) |
| | Spearman correlation | r = 0.867; P <0.001 |
| T-score, median (min–max) | DXA | –1.1 (–3.6 to 2.7) |
| | REMS | –1.15 (–3.1 to 1) |
| | Spearman correlation | r = 0.871; P <0.001 |

Abbreviations: see Figures 1 and 2 and Table 1

Figure 3. Bland–Altman plots of the differences between dual-energy X-ray absorptiometry and radiofrequency echographic multispectrometry measurements of bone mineral density (BMD) plotted against the averages of the 2 measurements in the lumbar spine group (A) and the femur group (C); and of T-scores plotted against the averages of the 2 measurements in the lumbar spine group (B) and the femur group (D)

The effects of sex, age, and BMI on the obtained results were studied in detail and are reported in Table 3 (lumbar spine group) and Table 4 (femur group). The correlations between REMS and DXA scores remained strong, both in the younger and the older patients (age <60 and ≥60 years, respectively), as well as in patients with normal body weight and those who were overweight (BMI <25 kg/m2 and ≥25 kg/m2, respectively). The Spearman coefficient was not calculated for the group of men with lumbar spine scans due to lack of power (the group was too small, with only 5 results), but the correlation remained significant in men with femur scans. No significant differences between the correlations were observed in any of the subgroup comparisons.
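The comparison of correlations between two independent subgroups by Fisher Z-transformation can be sketched as follows. Two points are assumptions rather than statements from the text: the standard error uses the usual 1/(n – 3) terms (some authors apply a correction factor for Spearman coefficients), and both one-sided and two-sided P values are returned because the text does not say which was reported. For the lumbar spine BMD age subgroups (r = 0.781, n = 25 vs r = 0.834, n = 33), the one-sided value from this sketch is close to the reported P = 0.293, which suggests, but does not prove, that one-sided values were tabulated.

```python
import math


def compare_correlations(r1, n1, r2, n2):
    """Fisher z test for two independent correlations.

    Returns (z statistic, one-sided P, two-sided P).
    """
    if n1 <= 3 or n2 <= 3:
        raise ValueError("each group needs more than 3 observations")
    # Fisher transformation: atanh(r) is approximately normal with variance 1/(n - 3).
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    # Upper-tail probability of the standard normal distribution at |z|.
    p_one_sided = 0.5 * math.erfc(abs(z) / math.sqrt(2.0))
    p_two_sided = min(1.0, 2.0 * p_one_sided)
    return z, p_one_sided, p_two_sided


# Lumbar spine BMD, age <60 years vs >=60 years (values from Table 3).
z, p_one, p_two = compare_correlations(0.781, 25, 0.834, 33)
print(f"z = {z:.3f}, one-sided P = {p_one:.3f}, two-sided P = {p_two:.3f}")
```

Under either convention, the difference between these two subgroup correlations is far from the 0.05 significance threshold.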

Table 3. Sub-analysis of the lumbar spine group

| Variable | Diagnostic agreement, % | Spearman correlation, BMD | Spearman correlation, T-score |
|---|---|---|---|
| Sex: male (n = 5) | 100 | Analysis not performed due to lack of power | Analysis not performed due to lack of power |
| Sex: female (n = 53) | 79.2 | Analysis not performed due to lack of power | Analysis not performed due to lack of power |
| Age, y: <60 (n = 25) | 80 | r = 0.781; P <0.001 | r = 0.791; P <0.001 |
| Age, y: ≥60 (n = 33) | 81.8 | r = 0.834; P <0.001 | r = 0.848; P <0.001 |
| Age, y: difference | | P = 0.293 | P = 0.267 |
| BMI, kg/m2: <25 (n = 23) | 87 | r = 0.858; P <0.001 | r = 0.869; P <0.001 |
| BMI, kg/m2: ≥25 (n = 35) | 77.1 | r = 0.776; P <0.001 | r = 0.784; P <0.001 |
| BMI, kg/m2: difference | | P = 0.19 | P = 0.169 |

Significance was tested with Fisher Z-transformation.

Abbreviations: see Figure 1 and Table 1

Table 4. Sub-analysis of the femur group

| Variable | Diagnostic agreement, % | Spearman correlation, BMD | Spearman correlation, T-score |
|---|---|---|---|
| Sex: male (n = 13) | 76.9 | r = 0.714; P = 0.006 | r = 0.765; P = 0.002 |
| Sex: female (n = 53) | 84.9 | r = 0.854; P <0.001 | r = 0.862; P <0.001 |
| Sex: difference | | P = 0.139 | P = 0.199 |
| Age, y: <60 (n = 22) | 81.8 | r = 0.778; P <0.001 | r = 0.818; P <0.001 |
| Age, y: ≥60 (n = 44) | 84.1 | r = 0.88; P <0.001 | r = 0.877; P <0.001 |
| Age, y: difference | | P = 0.113 | P = 0.223 |
| BMI, kg/m2: <25 (n = 23) | 78.3 | r = 0.789; P <0.001 | r = 0.718; P <0.001 |
| BMI, kg/m2: ≥25 (n = 43) | 86 | r = 0.848; P <0.001 | r = 0.846; P <0.001 |
| BMI, kg/m2: difference | | P = 0.255 | P = 0.108 |

Significance was tested with Fisher Z-transformation.

Abbreviations: see Figure 1 and Table 1

Discussion

We found a significant diagnostic agreement between REMS and DXA measurements in all patients, irrespective of sex, age, and BMI. In our study, the diagnostic concordance was 84.8% for the femoral neck and 82.8% for the lumbar spine. These results are in line with those of 2 large multicenter studies comparing DXA and REMS. In the Italian study involving 1914 postmenopausal women, the diagnostic agreement between DXA and REMS was 88.2% for the femoral neck and 88.8% for the lumbar spine.14 In the most recent and largest international study to date, including 4307 women aged 30 to 90 years, the diagnostic concordance was 86% for the femoral neck and 86.6% for the lumbar spine.16 Despite these promising results, the accuracy of REMS should be interpreted with caution. In the lumbar spine group, 7 patients had osteopenia according to DXA but would be considered healthy according to REMS.

In our study, all DXA and REMS reports were carefully checked for possible errors. Our observations show that the automatic algorithm used in REMS, by excluding nondiagnostic scans, helps to eliminate most of the technical errors typical of DXA (such as incorrect patient positioning or erroneous data analysis), which is in accordance with the results of a previous study by Messina et al.13 While DXA remains the gold standard for the assessment of bone density, it requires excellent technique to avoid erroneous results.4,6 According to a study by Krueger et al,12 technical errors can be identified in 90% of DXA scans. Similar results were reported by Messina et al13: more than 90% of DXA reports included 1 or more errors, mostly related to incorrect data analysis or patient positioning. In contrast, the proper technique of REMS is quite easy to master after a short training, as the operator is required to set only 2 parameters during the examination: transducer depth and focus. The advantage of REMS in terms of automatic elimination of erroneous reports can be especially useful in the evaluation of lumbar spine scans. Structures such as osteophytes, calcifications (eg, atherosclerotic plaques in the abdominal aorta), or compression fractures may falsely increase BMD, leading to erroneous automatic DXA reports. In standard DXA, it is essential to thoroughly analyze each vertebra, and a minimum of 2 vertebrae should be assessed to obtain reliable results according to the International Society for Clinical Densitometry guidelines.11 The same principle applies to REMS, but it is done automatically. Indeed, in our study, the Bland–Altman plot for the T-score in the lumbar region shows an increasing difference between the measurements for higher T-scores, suggesting falsely elevated DXA results in the presence of degenerative lesions. Errors related to incorrect patient positioning can affect both spinal and femoral DXA results. Typically, the spine is not centered or straight in the image field, or the femur is adducted or abducted and inadequately rotated internally. The automatic algorithm used in REMS eliminates patient positioning errors through selective analysis of the trabecular bone (by comparison of spectral features of the tested area with the spectral model of the trabecular bone)20: if the received image is inadequate, the result cannot be obtained and the examination should be repeated. These advantages of REMS have been recognized by the ESCEO.17

Our experience shows that to obtain a good image, the REMS examination of the spine has to be performed after fasting, as in the case of abdominal ultrasound (intestinal gas can interfere with imaging of the vertebrae). Additionally, the usefulness of REMS is limited in obese patients: the maximal distance separating the bone surface from the ultrasound probe (the “depth” regulated by the operator) should be 210 mm for the spine and 150 mm for the femoral neck. In a previous study, it was shown that the diagnostic agreement between REMS and DXA can be slightly lower in elderly patients (aged >65 years) with BMI higher than 25 kg/m2 than in younger patients (69.6% vs 81.5%, respectively).19 Still, this result may be due to degenerative lesions in the spine and the associated false-negative DXA results: in a study in which DXA and REMS reports were evaluated for possible errors, REMS was assessed as feasible for all the patients without extreme obesity (BMI up to 40 kg/m2).14 Similarly, in our study, there was no significant difference in the correlations between DXA and REMS spine results between patients with normal body weight and those who were overweight.

The correlations between DXA and REMS results observed in our study provide additional evidence supporting the usefulness of REMS as an alternative to classic densitometry in BMD assessment. One of the strengths of our study is the validation of the obtained results (exclusion of reports with errors). Moreover, it was a real-life study, and patients were not specifically selected for this short investigation. The biggest limitation is the small sample size. However, the groups were large enough given the observed differences: the statistical power was sufficient to evaluate the significance of the expected correlation coefficients higher than 0.8 (except for the subgroup of men with lumbar spine scans). Another limitation is the cross-sectional design of the study. Prospective studies that could demonstrate the usefulness of REMS in predicting osteoporotic fractures would be of greatest clinical value. To date, only a single study, by Adami et al,15 has assessed that issue prospectively, showing promising results. Finally, we did not evaluate the intra-operator repeatability. Still, according to a previous study by Di Paola et al,14 the precision error of REMS (root-mean-square coefficient of variation) was 0.32% for the femoral neck and 0.38% for the lumbar spine, which is much smaller than that of DXA (estimated to be 1.47%21 for the femoral neck and 1.26% for the spine22).
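For readers unfamiliar with the precision metric quoted above, the root-mean-square coefficient of variation can be computed from repeated scans of the same patients as in the sketch below. This follows the commonly used definition (the square root of the mean of the squared per-patient coefficients of variation); it is not taken from the cited studies, and the function name and the repeated BMD values are hypothetical.

```python
import math
from statistics import mean, stdev


def rms_cv_percent(repeated_scans):
    """Root-mean-square coefficient of variation, in percent.

    repeated_scans: one list of repeated BMD measurements per patient.
    """
    squared_cvs = []
    for scans in repeated_scans:
        if len(scans) < 2:
            raise ValueError("each patient needs at least 2 repeated scans")
        # Per-patient coefficient of variation: sample SD divided by the mean.
        cv = stdev(scans) / mean(scans)
        squared_cvs.append(cv**2)
    return 100.0 * math.sqrt(mean(squared_cvs))


# Hypothetical repeated BMD measurements in g/cm2 for 3 patients (not study data).
scans = [[0.912, 0.915, 0.909], [0.701, 0.704, 0.699], [1.020, 1.016, 1.023]]
print(f"RMS-CV = {rms_cv_percent(scans):.2f}%")
```

Averaging the squared coefficients of variation, rather than the coefficients themselves, gives more weight to patients with poorer repeatability.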

Radiofrequency echographic multispectrometry is a novel densitometric method showing a significant diagnostic agreement with traditional DXA. Its accuracy is acceptable, irrespective of the age and BMI of the patients. This method could be particularly helpful in the diagnosis of osteoporosis in the elderly, as it does not yield falsely elevated BMD values, because degenerative lesions and aortic plaques are omitted from the analysis. It has the advantages of quantitative ultrasound and automatically eliminates a substantial proportion of erroneous results; therefore, it could be a useful technique in routine clinical practice. The ESCEO has already rated REMS as a clinically available and valuable technology for the diagnosis of osteoporosis and fracture risk assessment.17 Nevertheless, further studies are required. The clinical utility of REMS should be examined in particular groups of patients, including those who should avoid ionizing radiation (children and pregnant women) and those at increased risk of developing secondary osteoporosis (patients with chronic steroid therapy, diabetes mellitus, rheumatic diseases, chronic kidney disease, or oncological diseases).