Introduction

After a first venous thromboembolism (VTE), patients are at risk of a recurrent event.1-3 Recurrence can be prevented by indefinite anticoagulant treatment, although this comes at the cost of an increased risk of bleeding.4,5 For this reason, indefinite anticoagulant treatment is only justified if the reduction in VTE recurrence risk (benefit) outweighs the increased risk of bleeding (harm).3 Current guidelines recommend indefinite treatment after the first VTE for patients with major persistent risk factors, such as active malignancy or antiphospholipid syndrome.6-9 For patients in whom no risk factors are present (also called unprovoked or idiopathic VTE), most guidelines suggest continuing treatment,6-8 whereas others recommend it,9 especially if the risk of bleeding is low. Discontinuation is recommended for patients whose VTE occurred in the presence of major transient risk factors (eg, major surgery or trauma with fractures).6-9 In the case of minor transient risk factors (eg, hormone use, confinement to bed outside hospital, long-haul flights), some guidelines recommend or suggest discontinuation,7-9 while others suggest considering extended anticoagulant therapy.6 It is advised that the decision to extend anticoagulant treatment involve a discussion with the patient, in which the benefits and risks of continuing or stopping are presented.6-9

Despite these broad recommendations, it remains difficult in clinical practice to balance the risks of VTE recurrence and major bleeding at the individual level, as many situations are not straightforward. Consequently, deciding whether to advise indefinite anticoagulant treatment after a first VTE remains challenging. To provide tailored treatment after the first VTE, the ultimate goal is to precisely predict the individual risks of both VTE recurrence and bleeding, and to weigh these against each other for each patient.

In this review, we discuss the current literature on the prediction of VTE recurrence and bleeding for patients with a first VTE without malignancy (as different guidelines apply to patients with malignancy). We cover the why, what, which, and how of prediction models: we explain why prediction models are needed, provide background on what they are, summarize which models are currently available for the prediction of recurrent VTE and bleeding, and outline how to proceed to implement these models in clinical practice.

Why do we need prediction models?

Traditionally, guidelines on VTE management distinguish between patients with and without (transient) risk factors at the time of their first VTE, since the risk of recurrence varies considerably between these groups. Recent meta-analyses showed a cumulative recurrence risk of 1% to 10% in the first year, and 3% to 25% within 5 years after the first VTE, depending on whether and which transient risk factors were present.1-3 Patients who had a VTE in the context of a major persistent provoking factor, such as cancer or antiphospholipid syndrome, have the highest risk of recurrence, with a recurrence rate of 15% within 1 year.3 For patients without identifiable risk factors, the risk of recurrence after discontinuation is 10% within the first year and 25% within 5 years.1 For patients with minor transient risk factors, the risk of recurrence is around 5% in the first year and 15% within 5 years. In patients with major transient risk factors, the recurrence risk is the lowest, with a recurrence rate of 1% within 1 year and 3% within 5 years.3

Based on these risks, indefinite treatment is considered beneficial for patients with major persistent provoking factors, unless the risk of bleeding is extremely high. For the groups with lower recurrence risks, the benefit of indefinite anticoagulant therapy is less clear and still a matter of debate.

As mentioned above, the most important criterion in most guidelines is the presence or absence of transient risk factors. However, this binary distinction is rudimentary, since a broad range of recurrence risks exists among patients with provoked VTE as well as among those with unprovoked VTE. For instance, for VTE patients with a transient risk factor, the risk of recurrence differs depending on whether this risk factor is classified as major or minor.3 Likewise, within the group of patients with an unprovoked VTE, certain characteristics are associated with a lower or higher risk of recurrence. For example, men with an unprovoked VTE have a 1.8-fold higher risk of recurrence than women.10 This variation in recurrence risk became apparent in a previous study by our group, which showed substantial overlap between the predicted 2-year recurrence risks of patients with a first provoked and a first unprovoked VTE (Figure 1).11 Hence, a more precise estimation of the individual VTE recurrence risk should be pursued. Furthermore, as guidelines acknowledge, the decision on anticoagulant treatment duration should be based not only on the risk of VTE recurrence but also on the risk of bleeding.

Figure 1. Histogram of 2-year predicted risks of recurrence according to L-TRRiP model A for patients with a provoked first venous thromboembolism (VTE) (A) and patients with an unprovoked first venous thromboembolism (B). Adapted from Timp et al11

The risk of bleeding during anticoagulant therapy for VTE is substantial. A recent meta-analysis showed a cumulative incidence of major bleeding events of approximately 1.5% in the first year and 6% within 5 years in patients on extended anticoagulant therapy,4 whereas the risk of clinically relevant nonmajor bleeding is approximately 6% in the first year and 22% within 5 years.12 The risk of both types of bleeding is slightly lower for patients treated with direct oral anticoagulants (DOACs) than for those receiving vitamin K antagonists (VKAs).4,13 In addition, other factors, such as age, previous bleeding, and active malignancy, are associated with the risk of bleeding.14

To improve long-term treatment decisions, several studies aimed to optimize treatment duration after the first VTE based on factors other than whether the event was provoked or unprovoked, such as D-dimer levels15 or residual thrombosis.16 However, these single-factor approaches failed to distinguish well enough between patients at low and high risk of recurrent VTE. A more refined approach that incorporates multiple prognostic factors in a single prediction model may therefore have greater potential, and for this reason several such models for recurrent VTE and bleeding have been developed in the past decade.

What are prediction models?

A prediction model is a scoring system or formula that can be used to classify a patient's risk using information on several factors. When a prediction model is presented as a scoring system, such as the CHA2DS2-VASc score, a total score is determined based on the presence or absence of predictors. Often a threshold is provided to classify patients into risk categories according to the total score. Of note, the categories give information on a relative scale (the higher the score, the higher the risk), but information in absolute terms is generally missing. Alternatively, a model can be presented as a formula that can be used to calculate the absolute risk of an outcome at a certain time point, that is, the prediction horizon.17
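The difference between the 2 presentations can be illustrated with a minimal sketch. All predictor names, point values, coefficients, the threshold, and the baseline survival below are hypothetical and are not taken from any published model. The formula variant additionally assumes, purely for illustration, a Cox-type model in which the absolute risk at the prediction horizon equals 1 minus the baseline survival raised to the power of the exponentiated linear predictor.

```python
import math

# Hypothetical point values for a scoring system (illustration only).
POINTS = {"male_sex": 1, "elevated_d_dimer": 2, "proximal_dvt": 1}
HIGH_RISK_THRESHOLD = 2  # hypothetical cut-off for the "high risk" category


def total_score(patient):
    """Sum the points of every predictor that is present in the patient."""
    return sum(pts for name, pts in POINTS.items() if patient.get(name, False))


def risk_category(score):
    """Relative classification only: says nothing about absolute risk."""
    return "high" if score >= HIGH_RISK_THRESHOLD else "low"


# Hypothetical regression coefficients and baseline survival at a
# 2-year prediction horizon (assumed Cox-type model, illustration only).
COEFFICIENTS = {"male_sex": 0.5, "elevated_d_dimer": 0.7, "proximal_dvt": 0.3}
BASELINE_SURVIVAL_2Y = 0.95


def absolute_risk(patient):
    """Absolute risk at the horizon: 1 - S0(t) ** exp(linear predictor)."""
    linear_predictor = sum(
        beta for name, beta in COEFFICIENTS.items() if patient.get(name, False)
    )
    return 1.0 - BASELINE_SURVIVAL_2Y ** math.exp(linear_predictor)


patient = {"male_sex": True, "elevated_d_dimer": True, "proximal_dvt": False}
score = total_score(patient)
print(score, risk_category(score), round(absolute_risk(patient), 3))
```

The score returns a category, whereas the formula returns a probability that can be weighed directly against another absolute risk, such as the risk of bleeding.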

Prediction models can be broadly divided into 2 categories: prognostic models and diagnostic models. The prognostic models predict the chance for a disease or outcome to occur, which can have an informative purpose or can be used to guide treatment decisions.18 Examples of prognostic models are the CHA2DS2-VASc score and the Framingham risk score. Diagnostic models predict the chance that a certain disease is present, and can be used to decide whether additional diagnostic procedures are needed. The Wells score is a well-known example of a diagnostic model.

Development of a prediction model starts with defining a research question and considering available data, candidate predictors, and the outcome of interest. To develop a valid prediction model, several methodological aspects should be considered carefully, such as the handling of continuous variables, definitions of predictors, handling of missing data, the number of candidate predictors relative to the number of outcome events, and the methods of statistical modelling.19 For instance, dichotomizing continuous variables might result in loss of information, and testing too many candidate predictors for the number of available outcome events might result in an overfitted model that does not perform well outside the development cohort or during validation.19
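The ratio of outcome events to candidate predictors can be computed directly, as in the sketch below. The event counts and numbers of candidate predictors are those reported for 3 development cohorts in Table 2. The threshold of 10 events per candidate predictor is a commonly cited rule of thumb that we use here as an assumption for illustration; it is not a criterion taken from the cited studies, and the function name is our own.

```python
# Assumed rule of thumb (not from the cited studies): at least
# 10 outcome events per candidate predictor.
RULE_OF_THUMB_EPV = 10


def events_per_candidate_predictor(events, candidate_predictors):
    """Number of outcome events available per candidate predictor tested."""
    if candidate_predictors <= 0:
        raise ValueError("candidate_predictors must be positive")
    return events / candidate_predictors


# (events, candidate predictors) as reported in Table 2.
development_cohorts = {
    "Men and HERDOO2": (91, 69),
    "Vienna": (176, 8),
    "DAMOVES": (65, 15),
}

for model, (events, predictors) in development_cohorts.items():
    epv = events_per_candidate_predictor(events, predictors)
    flag = "below" if epv < RULE_OF_THUMB_EPV else "at or above"
    print(f"{model}: {epv:.1f} events per candidate predictor ({flag} rule of thumb)")
```

A low ratio does not prove that a model is overfitted, but it indicates that performance in the development cohort should be interpreted with caution until validation results are available.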

Ideally, a prediction model would distinguish perfectly between patients who do and do not develop the outcome (in this setting, recurrent VTE or clinically relevant bleeding). This ability and its accuracy can be expressed by measures of discrimination and calibration.20 Discrimination refers to how well a model can differentiate between patients with and without the outcome. It is measured by the C statistic, which can be interpreted as the probability that a patient with the outcome has a higher predicted risk than a patient without the outcome. If a model is unable to discriminate between patients with and without the outcome, the C statistic is 0.5. If a model discriminates perfectly by always assigning a higher probability to those who develop the outcome than to those who do not, the C statistic is 1.0.20 Generally, a C statistic of 0.60 to 0.75 is considered possibly helpful, and a C statistic above 0.75 is considered to indicate good discrimination.20 The accuracy of the predicted risk, that is, whether the predicted values correspond to the observed values, is reflected by calibration. Calibration is assessed by comparing the predicted and observed risks in different risk categories or in different patient groups. This can be done by plotting the observed versus predicted risks or, although less informative, by testing overall goodness of fit using the Pearson χ2 or Hosmer–Lemeshow test, in which a P value below 0.05 indicates a significant difference between the observed and predicted risks. A poorly calibrated model over- or underestimates the risk, whereas a well-calibrated model should provide good estimates of the individual risk of the outcome across the range of outcome incidences.20
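Both measures can be sketched in a few lines for the simplified case of a binary outcome without censoring. This is an assumption made for illustration: the cited models predict time-to-event outcomes, for which variants of the C statistic that account for censoring are typically used. The function names are our own. The first function implements the pairwise interpretation given above, counting ties in predicted risk as one-half. The second returns the mean predicted and observed risks per risk group, which are the points one would plot in a calibration plot.

```python
def c_statistic(predicted, outcome):
    """Probability that a patient with the outcome has a higher predicted
    risk than a patient without it (ties in predicted risk count as 0.5)."""
    concordant = 0.0
    pairs = 0
    for p_event, y_event in zip(predicted, outcome):
        if y_event != 1:
            continue
        for p_none, y_none in zip(predicted, outcome):
            if y_none != 0:
                continue
            pairs += 1
            if p_event > p_none:
                concordant += 1.0
            elif p_event == p_none:
                concordant += 0.5
    if pairs == 0:
        raise ValueError("need at least one patient with and one without the outcome")
    return concordant / pairs


def calibration_table(predicted, outcome, n_groups):
    """Mean predicted vs observed risk in groups of increasing predicted risk."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    size = len(order) / n_groups
    rows = []
    for g in range(n_groups):
        idx = order[round(g * size):round((g + 1) * size)]
        if not idx:
            continue
        mean_predicted = sum(predicted[i] for i in idx) / len(idx)
        observed = sum(outcome[i] for i in idx) / len(idx)
        rows.append((mean_predicted, observed))
    return rows


# Toy data (made up): predicted risks and observed outcomes (1 = event).
risks = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60]
events = [0, 0, 1, 0, 1, 1]
print(c_statistic(risks, events))
print(calibration_table(risks, events, n_groups=3))
```

In a well-calibrated model, the two values in each row of the calibration table are close to each other across all risk groups.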

Furthermore, it is important that model performance is assessed by internal and, even more importantly, external validation. Internal validation assesses the stability of the model in different subsets of the development sample, whereas external validation determines the validity of the model in a different population (eg, a different hospital or country).17 External validation is an essential step in deciding whether a model can be applied in clinical practice.

When assessing the clinical applicability of a prediction model, one should examine the validity of the development methods as well as the reported model performance, which can be done using the Prediction model Risk Of Bias Assessment Tool (PROBAST).21 However, adequate model development and performance do not guarantee that using the model in clinical practice will improve medical decision making or, more importantly, the health outcomes of patients. For that purpose, management and implementation studies are needed, in which the added value of making treatment decisions based on the predicted risk is evaluated and barriers to implementation are identified.22

Which prediction models for venous thromboembolism recurrence and bleeding do we have?

To date, 17 models to predict VTE recurrence have been published: Men and HERDOO2, Vienna, Vienna update, DASH, DAMOVES, pre D-dimer model, post D-dimer model, Worcester VTE model (3 months and 3 years), L-TRRiP (model A, B, C, and D), AIM-SHA-RP (men and women), Continu-8, and VTE-PREDICT.11,23-33 The predictors included in these models are shown in Table 1, and the development studies and model characteristics are summarized in Table 2. Nine of these models have been externally validated at least once.11,28,33-41 These external validation studies are summarized in Table 2, and a detailed overview is included in Supplementary material, Table S1. For the prediction of bleeding in VTE patients, 15 models have been published that were solely intended for VTE patients: the score by Kuijer et al,42 Kearon et al,43 RIETE, ACCP, VTE-BLEED, EINSTEIN (before and after 3 weeks and during entire period), Hokusai, Seiler et al,50 Martinez et al,51 Alonso et al,52 PE-SARD, CHAP, and VTE-PREDICT,14,33,42-54 of which 10 were externally validated at least once.33,47-73 Furthermore, 7 models (OBRI, modified OBRI, Shireman et al, HEMORR2-HAGES, HAS-BLED, ATRIA, and ORBIT scores; Supplementary references, S42–S48) were validated in VTE patients, while having been developed for other patient groups using anticoagulant therapy, mainly for atrial fibrillation.47-50,52,54-60,63,66-68,72-76 The predictors of the bleeding risk models are summarized in Table 3, development studies and performance of models intended for VTE patients are summarized in Table 4, and a detailed overview of the external validation studies is provided in Supplementary material, Table S2. The characteristics of the models that were only validated in VTE patients are described in Supplementary material, Tables S3 and S4.

Table 1. Overview of variables included in the prediction models for recurrent venous thromboembolism

Variable

Men and HERDOO223

Vienna24

DASH25

Vienna update26

DAMOVES27

Pre D-dimer model28

Post D-dimer model28

Worcester VTE 3 years29

Worcester VTE 3 months29

L-TRRiP (model A)11, 30

L-TRRiP (model B)11, 30

L-TRRiP (model C)11, 30

L-TRRiP (model D)11, 30

AIM-SHA-RP men31

AIM-SHA-RP women31

Continu-832

VTE-PREDICT33

Clinical variables

General characteristics

Age

x

x

x

x

x

x

x

Sex

x

x

x

x

x

x

x

x

x

x

x

x

BMI / obesity

x

x

x

Characteristics of index VTE

Location of DVT

x

x

x

x

x

x

x

x

Type of VTE (PE or DVT)

x

x

x

x

x

x

x

x

x

x

x

Provoked status

x

Provoking factors

Surgery

x

x

x

x

x

x

x

xa

Plaster cast

x

x

x

x

Immobilization

x

x

x

x

xa

Hormone therapy

x

x

x

x

x

x

Pregnancy / puerperium

x

x

x

x

Trauma

x

xa

Pneumonia / sepsis

x

Varicose vein stripping

x

x

Thrombophlebitis

x

Active cancer

xa

x

Medical history / comorbidities

Cardiovascular disease

x

x

x

x

x

Previous VTE

x

History of malignancy

x

Chronic renal disease

x

Varicose veins

x

Medication use

Statins

x

Antiplatelet therapy

x

Pre-existing anticoagulant use

x

Chemotherapy

xa

Other

Post-thrombotic signs

x

IVC filter

x

x

Time between anticoagulant cessation and D-dimer measurement

x

x

Laboratory variables

D-dimer

x

x

x

x

x

x

x

x

Factor VIII

x

x

x

x

Von Willebrand factor

x

CRP

x

Factor V

x

Factor X

x

Fibrinogen

x

APC ratio

x

Genetic variables

Prothrombin G20210A

x

Factor V Leiden

x

x

x

Blood group, non-O

x

a Variables combined into 1 variable in the model

Abbreviations: APC, activated protein C; BMI, body mass index; CRP, C-reactive protein; DVT, deep vein thrombosis; IVC, inferior vena cava; PE, pulmonary embolism; VTE, venous thromboembolism

Table 2. Overview of development, internal validation, and external validation of the prediction models for recurrent venous thromboembolism

Model; author, year

Model development

Model characteristics

Internal validation

External validation

Study design and setting

Population

n (events / total)

Follow-upa

Outcome

Candidate predictors, n

Time horizon

Prediction outcome

Discriminationb

Calibration

Men and HERDOO2; Rodger et al,23 2008

Prospective cohort, 12 tertiary care centers in 4 countries; between 2001 and 2006

First unprovoked proximal DVT or PE, treated with AC for 5–7 months; exclusion criteria: VTE provoked by leg fracture, leg plaster cast, immobilization >3 days, anesthetic in the past 3 months, malignancy in the past 5 years, known high-risk thrombophilia

91/646

Mean, 1.5 years

Objectively confirmed symptomatic recurrent DVT or PE

69

Not specified

Score of 0–4; low risk: women with score ≤1; high risk: all men and women with score >1; corresponding annual recurrence rate

Not reported

Not reported

2 studies; C statistic 0.56–0.61; calibration not reported

Vienna; Eichinger et al,24 2010

Prospective cohort, 4 thrombosis centers in Austria; between 1992 and 2008

First unprovoked VTE, treated with AC for ≥3 months; exclusion criteria: VTE provoked by surgery, trauma, pregnancy, hormone use, malignancy, antithrombin, protein C or protein S deficiency, lupus anticoagulant

176/929

3.6 years

Objectively confirmed symptomatic recurrent DVT or PE

8

12 or 60 months

Nomogram of score (0–350); corresponding estimated recurrence rate

0.67 (12 months); 0.64 (60 months)

No calibration curve reported, P value lack of fit, 0.54

3 studies; C statistic 0.61–0.63; underestimation of risks in 1 study, 2 others showed reasonable correspondence between the observed and predicted risks

DASH; Tosetto et al,25 2012

Individual patient data from 5 prospective cohorts and 2 trials; Austria, Canada, Italy, Switzerland, UK, and US; published between 2006 and 2008

First unprovoked proximal DVT or PE, treated with AC for ≥3 months; exclusion criteria: VTE provoked by surgery, trauma, immobility, pregnancy and puerperium, active cancer, known antiphospholipid antibodies or antithrombin deficiency

239/1818

1.8 years

Symptomatic recurrent VTE

6

Not specified

Score of –2 to 4; low risk: score ≤1, high risk: score >1

0.71

No calibration curve reported, optimism correction factor of 0.97 suggests good overall calibration

6 studies; C statistic 0.52–0.65; calibration slope of 0.71 suggesting overfitting in 1 study, 2 studies reported reasonable correspondence between the observed and predicted risks, 3 studies did not report calibration

Vienna update; Eichinger et al,26 2014

Prospective cohort, 4 thrombosis centers in Austria; between 1992 and 2008

First unprovoked VTE, treated with AC for ≥3 months; exclusion criteria: VTE provoked by surgery, trauma, pregnancy, hormone use, malignancy, antithrombin, protein C or protein S deficiency, lupus anticoagulant

150/553

6 years

Objectively confirmed symptomatic recurrent DVT or PE

3

60 months

Nomogram of score (0–260) and corresponding estimated recurrence rate, stratified by time of prediction (3 weeks, 3, 9, or 15 months)

0.63 (3 weeks); 0.61 (3 months); 0.61 (9 months); 0.58 (15 months)

Calibration plots indicate good calibration after shrinkage, slope of 0.96 (3 weeks), 1.03 (3 months), 0.97 (9 months), and 0.94 (15 months)

2 studies; C statistic 0.39–0.58; 1 study reported P <⁠0.05 for lack of fit indicating significant difference between the observed and predicted risks, 1 study did not report calibration

DAMOVES; Moreno et al,27 2016

Prospective cohort, 2 hospitals in Spain; between 2004 and 2013

First unprovoked VTE, treated with AC for ≥3 months; exclusion criteria: VTE provoked by surgery, trauma, immobility, previous hospitalization, pregnancy, puerperium, hormone use, active cancer, known strong thrombophilia

65/398

1.8 years

Objectively confirmed symptomatic recurrent DVT or PE

15

Not specified

Nomogram of score of 0–30 and corresponding annual recurrence probability; low risk: <⁠11.5 (risk <⁠5%); high risk: ≥11.5

0.91

Excellent calibration according to curve

1 study; C statistic 0.83; P = 0.125 (Hosmer–Lemeshow test)

Pre- and post D-dimer model; Ensor et al,28 2016

Individual patient data from 7 trials from Canada (RVTEC); published between 2003 and 2008

First unprovoked VTE in patients who discontinued AC; exclusion criteria: VTE provoked by surgery, lower limb trauma, pregnancy, hormone use, significant immobility, active cancer, incomplete predictor information

230/1626 (pre), 161/1200 (post)

1.8 years

Recurrent VTE

5 (pre), 7 (post)

3 years

Absolute risk of recurrence

Overall 0.56 (pre) and 0.69 (post); varying between individual studies

Varying between individual studies and prediction horizon, overall difference between the observed and expected risks at 1 year 0.0 (pre) and –0.02 (post)

1 study (pre D-dimer), post D-dimer not externally validated; C statistic 0.56; underestimation at lower predicted risks

Worcester VTE; Huang et al,29 2016

Retrospective population-based cohort, 12 hospitals in the US; between 1999 and 2009

First VTE; exclusion criteria: upper-extremity DVT; treatment duration not considered

329/2989

2.5 years

Objectively confirmed recurrent DVT or PE

>50

3 months or 3 years

Score of 0–100 (only reported for 3-year model); divided into 4 risk categories: 0, 1–18, 19–24, ≥25

0.62 (3 years)

No calibration curve reported, P value goodness of fit 0.29–0.70 depending on risk score category, table of the observed and expected risks suggests adequate calibration

No external validation

L-TRRiP (model A–D); Timp et al,11,30 2019

Prospective cohort (MEGA follow-up study), 4 anticoagulation clinics, the Netherlands; between 1999 and 2004

First lower-extremity DVT or PE, age 18–70 years, patients who discontinued AC; exclusion criteria: malignancy in the past 5 years

507/3750

5.7 years

Unprovoked certain recurrent DVT or PE

39

2 years

Absolute risk of recurrence

0.72 (model A), 0.71 (model B), 0.69 (models C and D)

Excellent calibration according to curve, shrinkage slope 0.953 (model C)

2 studies model C, 1 study model D, models A and B not externally validated; C statistic: 0.56–0.64 (model C), 0.65 (model D); overestimation in the highest risk quintile (model C), good calibration of model D; 1 study did not report calibration

AIM-SHA-RP; Albertsen et al,31 2020

Danish nationwide registry, between 2012 and 2017

First DVT or PE, treated with AC for <⁠18 months; exclusion criteria: Danish residents <⁠5 years, active malignancy, myeloproliferative disorder, atrial fibrillation, AC within 1 year before VTE

966/11519

Mean, 1.4 years

Primary discharge diagnosis of recurrent VTE

17

2 years

Score of –4 to 3; men: low risk: <⁠–1; intermediate risk: –1; high risk: > –1; women: low risk <⁠0; intermediate risk: 0–2, high risk: >2

0.56 (men), 0.61 (women)

Plots of the observed and predicted risks for different scores show good calibration

No external validation

Continu-8; Nagler et al,32 2021

Prospective cohort, 1 hospital, Maastricht, the Netherlands; between 2003 and 2013

First proximal DVT treated in a clinical care pathway incorporating residual vein thrombosis in decision to discontinue AC treatment; exclusion criteria: PE, malignancy

64/479

3.1 years

Objectively confirmed, symptomatic recurrent VTE

4

Not specified

Score of 0–5; low risk: 0; intermediate risk: 1–3; high risk: 4–5; corresponding recurrence rate at 5 years

0.68

Not reported

No external validation

VTE-PREDICT; De Winter et al,33 2023

Individual patient data from 3 trials (Hokusai VTE, RE-MEDY, RE-SONATE) and 2 cohort studies (Bleeding Risk Study, PREFER in VTE), worldwide; between 2006 and 2016

Lower extremity DVT or PE, treated with AC for ≥3 months; exclusion criteria: active malignancy

220/15141

0.5 years

Objectively confirmed recurrent DVT or PE

13

5 years

Absolute risk of recurrence with and without extended treatment

Overall 0.68; varying between 0.51 and 0.79 in individual studies

Calibration plots show agreement between the predicted and observed risks, but with substantial heterogeneity between individual studies

External validation based on data from 5 studies; C statistic 0.48–0.71; calibration varying between individual studies

a Data shown as median unless stated otherwise.

b If provided, the optimism-corrected C statistic from internal validation is reported.

Abbreviations: AC, anticoagulation; others, see Table 1

Table 3. Overview of variables included in prediction models for bleeding

Variable

Kuijer et al42

Kearon et al43

RIETE44

ACCP45,46

VTE-BLEED47

EINSTEIN (bleeding in first 3 weeks)48

EINSTEIN (bleeding after 3 weeks)48

EINSTEIN (bleeding in entire period)48

Hokusai49

Seiler et al50

Martinez et al51

Alonso et al52

PE-SARD53

CHAP model54

VTE-PREDICT33

Clinical variables

General characteristics

Age

x

x

x

x

x

x

x

x

x

x

x

Sex

x

xa

xa

xa

x

x

x

x

Race

x

x

x

Characteristics of index VTE

Type of index VTE

x

x

x

x

Provoked by trauma / surgery

x

Medical history / comorbidities

Active malignancy

x

x

x

x

x

x

x

x

History of malignancy

x

x

(Major) bleeding

x

x

x

x

x

x

x

Gastrointestinal bleeding

x

Peptic ulcer disease

x

Stroke

x

x

x

xa

x

Transient ischemic attack

xa

Cardiovascular disease

x

Hypertension

x

Diabetes

x

x

x

Liver disease

x

x

x

x

Anemia

x

Chronic pulmonary disease

x

x

Dementia

x

Medication use

NSAIDs

x

xa

xb

xa

x

Antiplatelet therapy

x

x

xa

xb

x

xa

x

x

Type of anticoagulant

x

x

x

x

x

Poor INR control

x

x

Other

Fall risk

x

Low physical activity

x

Comorbidity and reduced functional capacity

x

Alcohol abuse

x

x

Syncope

x

Recent surgery

x

Physical examination

Systolic blood pressure

xa

x

x

Body surface

x

Weight

x

x

BMI

x

Laboratory variables

Hemoglobin (anemia)

x

x

x

x

x

xa

xa

x

x

x

x

x

x

x

Hematocrit

Creatinine (renal insufficiency)

x

x

x

x

x

x

x

x

x

Platelet count (thrombocytopenia)

x

x

x

x

a, b Variables denoted with a or b are combined into 1 variable in the model

Abbreviations: INR, international normalized ratio; NSAIDs, nonsteroidal anti-inflammatory drugs; others, see Table 1

Table 4. Overview of development, internal, and external validation of the prediction models for bleeding in patients with venous thromboembolism

Model; author, year

Model development

Model characteristics

Internal validation

External validation

Study design and setting

Population

n (events / total)

Follow-upa

Outcome

Candidate predictors, n

Time horizon

Prediction outcome

Discrimination

Calibration

Kuijer et al,42 1999

RCT (Columbus; LMWH vs UFH), multiple hospitals in 8 countries, between 1994 and 1995

Symptomatic DVT or PE; exclusion criteria: thrombolytic treatment, gastrointestinal bleeding in the past 14 days, surgery in the past 3 days, stroke in the past 10 days, low platelet count, pregnancy, body weight <⁠35 kg

93/1021

0.25 years

All bleeding events during AC; MB defined as clinically overt, Hb decrease >2 g/dl, requiring ≥2 units of blood, retroperitoneal, intracranial, or warranting discontinuation of AC

NA

Initial 3 months

Score of 0–8.8; low risk: <⁠3.75; intermediate risk: 3.75–6.25; high risk: >6.25

0.62 for all bleeding, 0.72 for MB

Not reported

15 studies; C statistic: 0.49–0.68; 3 studies report P value goodness of fit >0.05; 2 studies reported increasing event rate with increasing score, 10 studies did not report calibration

Kearon et al,43 2003; Gage et al,100 2006b

RCT (ELATE; extended VKA with low vs conventional intensity), Canada and USA, between 1998 and 2001

Unprovoked VTE, treated with AC for 3 months; exclusion criteria: other indications for AC, contraindication for long-term AC including high bleeding risk, antiphospholipid antibodies, life expectancy <⁠2 years

17/738

Mean, 2.4 years

MB (clinically overt, Hb decrease >2 g/dl, requiring ≥2 units of blood or at critical site) during extended AC

Not reported

Not specified

Number of risk factors (max 10)

Not reported

Not reported

5 studies; C statistic: 0.53–0.75; 3 studies reported P value goodness of fit >0.05; 1 study reported increasing event rate with increasing score, 1 study did not report calibration

RIETE; Ruiz-Giménez et al,44 2008

Data from registry (RIETE) of patients with acute VTE, 123 hospitals, mainly Spain, between 2003 and 2007

Acute symptomatic DVT or PE; exclusion criteria: participation in a blinded trial, not available for 3-month follow-up

314/13 057

0.25 years

MB (fatal, clinically overt, requiring ≥2 units of blood, spinal, intracranial or retroperitoneal) during AC

24

Initial 3 months

Score of 0–8; low risk: 0; intermediate risk: 1–4; high risk: >4

Not reported

Increasing incidence of MB at increasing total score

19 studies; C statistic 0.51–0.80; 4 studies reported P value goodness of fit >0.05, underestimation of predicted risks especially at higher risks in 1 study, 1 study reported fluctuating event rate, 1 study reported increasing event rate with increasing score, 12 studies did not report calibration

ACCP; Kearon et al,45,46 2012, 2016

NA: risk factors derived from literature

NA

NA

NA

MB (ISTH) with AC

NA

From fourth month onward

Risk category (low risk: 0 factors, intermediate risk: 1 factor, high risk: ≥2 factors)

NA

NA

6 studies; C statistic 0.52–0.65, 1 study reported P value goodness of fit >0.05, 1 study reported overestimation of risk above the third decile of predicted risks, 1 study reported increasing event rate except for the highest score, 3 studies did not report calibration

VTE-BLEED; Klok et al,47 2016

Individual patient data from 2 trials (RE-COVER I and RE-COVER II; dabigatran vs standard care), 31 countries worldwide, between 2008 and 2010, model developed in dabigatran arm

Acute symptomatic proximal DVT or PE; exclusion criteria: symptoms > 4 days, hemodynamic instability or need for thrombolytic therapy, other indication for AC, high risk of bleeding, eGFR <⁠30 ml/min/1.73 m2, life expectancy <⁠6 months, pregnancy, long-term antiplatelet therapy

138 (37 MB) /2553 (dabigatran arm); 51 MB /2554 (warfarin arm)

0.5 years

MB (ISTH) and CRNMB (ISTH) during AC

13

From second month onwards

Score of 0–9; low risk: 0–1; high risk: ≥2

MB beyond 30 days: 0.75 (dabigatran), 0.78 (warfarin). All bleeding entire period: 0.72 (dabigatran), 0.59 (warfarin)

Not reported

15 studies; C statistic 0.56–0.75; 2 studies reported P value goodness of fit >0.05; underestimation of predicted risks at higher scores in 1 study, 3 studies reported increasing event rate, with fluctuation in 1 study and except for the highest score in another study, 9 studies did not report calibration

EINSTEIN; Di Nisio et al,48 2016

Data from 2 trials (EINSTEIN DVT and EINSTEIN PE study; rivaroxaban vs enoxaparin / VKA), 38 countries, between 2007 and 2011

Acute symptomatic DVT or PE; exclusion criteria: fibrinolysis, thrombectomy or vena cava filter, contraindication for enoxaparin or VKA, creatinine clearance <⁠30 ml/min, liver disease, active bleeding, severe hypertension, pregnancy, use of CYP3A4 inhibitor / inducer

112/8245 (63/8060 after 3 weeks)

0.5 years

MB (ISTH) during AC

17

Day 21, between day 21 and day 210, during entire period

Absolute risk of bleeding

0.73 (for the first 3 weeks); 0.68 (after 3 weeks); 0.74 (entire period)

Not reported

1 study (validated the model for entire period); C statistic 0.60–0.70; calibration not reported

Hokusai; Di Nisio et al,49 2017

RCT (Hokusai VTE study; edoxaban vs warfarin), 37 countries worldwide, between 2010 and 2012, model developed in edoxaban arm

Acute symptomatic DVT or PE; exclusion criteria: contraindication for AC, treatment for >48 hours with heparin, >1 dose of VKA, cancer, another indication for AC, continued treatment with antiplatelet therapy, eGFR <⁠30 ml/min/1.73 m2

56/4118 (edoxaban arm), 122/8240 (total)

0.75 years

MB (ISTH) and CRNMB (ISTH) during AC

22

During treatment (3–12 months)

Score of 0–5

0.71 for MB; 0.62 for CRNMB; 0.60 in warfarin group

Good model fit according to authors; calibration plot itself not reported; P value goodness of fit test 0.97

1 study; C statistic 0.59–0.61; calibration not reported

Seiler et al,50 2017

Prospective cohort (SWITCO65+), 5 university and 4 nonuniversity hospitals, Switzerland, between 2009 and 2013

Acute symptomatic DVT or PE, age ≥65 years, continuing VKA beyond 3 months; exclusion criteria: conditions incompatible with follow-up (ie, terminal illness), thrombosis at another site than lower limb, catheter related thrombosis

66/743

Mean, 2.3 years

MB (ISTH) during extended AC

17

3 years

Score of 0–8; low risk: 0–1; moderate risk: 2–3; high risk: ≥ 4

0.75 (3 months), 0.69 (6 months), 0.68 (12 and 36 months), 0.67 (24 months)

P value goodness of fit test 0.93

1 study, C statistic 0.66–0.70; P value goodness of fit >0.05

Martinez et al,51 2020

Data from the UK Clinical Practice Research Datalink (CPRD) and Hospital Episodes Statistics (HES), UK, between 2008 and 2016

First VTE, given VKA within 30 days after initial VTE; exclusion criteria: post-thrombotic syndrome, ≥2 VKA prescriptions before initial VTE diagnosis, atrial fibrillation, or cardiac valve replacement

167/10 010

0.25 years

MB (fatal, at a critical site; with hematoma, compartment syndrome, anemia, or transfusion within 7 days; Hb decrease >2 g/dl within 14 days) or hospitalization for CRNMB, during VKA treatment

23

90 days

Score of 0–26; low risk: ≤6, high risk: ≥7

0.68 (0.75 for MB, 0.65 for hospitalization for CRNMB)

P value goodness of fit test 0.38

1 study, C statistic 0.52–0.58; calibration not reported

Alonso et al,52 2021

Data from health insurance claims, US, between 2011 and 2017

Diagnosis of VTE and prescription of AC within 1 month after VTE; exclusion criteria: AC use before VTE diagnosis and dabigatran users (because of low number)

2294/165 434

Mean, 0.4 years

Hospitalization for intracranial hemorrhage, gastrointestinal bleeding, or other MB within first 180 days after VTE

24

0.5 year

Absolute risk of bleeding

0.68 (0.67 at 3 months)

Calibration plot indicated adequate calibration

No external validation

PE-SARD; Chopard et al,53 2021

Data from the BFC-FRANCE registry, 5 hospitals, France, between 2011 and 2019

Acute PE; exclusion criteria: none

82/2754

2.8 days

MB (ISTH)

13

In-hospital

Score of 0–5; low risk: 0, intermediate risk: 1–2.5; high risk: >2.5

0.74

Observed vs predicted risks for risk categories correspond well; χ2 Hosmer–Lemeshow test 1.99

No external validation

CHAP; Wells et al,54 2022

Prospective cohort study, 12 tertiary care centers in Canada, US, and UK, between 2008 and 2016

Symptomatic unprovoked or weakly provoked DVT or PE, requiring extended anticoagulant therapy beyond 3 months; exclusion criteria: major transient or persistent risk factors (including major surgery, active cancer), MB during initial VTE treatment

118/2516

2.6 years

MB (ISTH) during extended AC

22

1 year (from fourth month onward)

Absolute risk of MB

0.67

Calibration plot indicates good calibration; calibration slope 0.87

No external validation

VTE-PREDICT; De Winter et al,33 2023

Individual patient data from 2 trials (EINSTEIN-CHOICE, GARFIELD-VTE) and 3 cohort studies (Danish registries, MEGA, and Tromsø study), worldwide, between 1977 and 2017

PE or DVT without malignancy

737/15 141

0.5 years

Composite of MB (ISTH) and CRNMB (ISTH)

13

5 years

Absolute risk of bleeding with and without extended treatment

Ranging from 0.65–0.73, overall 0.69

Calibration plots showed agreement between the predicted and observed risks, but with substantial heterogeneity between individual studies

External validation data from 5 studies; C statistic 0.61–0.68; calibration varying between studies (slope 0.55–0.86)

a Data shown as median unless stated otherwise

b Kearon et al first tested these criteria to stratify the risk of bleeding; Gage et al first described a score based on these criteria.

Abbreviations: CRNMB, clinically relevant nonmajor bleeding; eGFR, estimated glomerular filtration rate; Hb, hemoglobin; ISTH, International Society on Thrombosis and Haemostasis; LMWH, low-molecular-weight heparin; MB, major bleeding; NA, not applicable; RCT, randomized controlled trial; UFH, unfractionated heparin; VKA, vitamin K antagonist; others, see Tables 1 and 2

The models for prediction of VTE recurrence and bleeding differ from each other regarding the studied population, included predictors, prediction horizon, and performance during the internal and external validation. Most of these models were recently systematically summarized and critically appraised by de Winter et al.77 We summarize the main differences below.

Models for prediction of venous thromboembolism recurrence

The models for recurrent VTE were developed in different populations. All models are intended for patients with pulmonary embolism (PE) and / or deep vein thrombosis (DVT), except for the Continu-8 model, which was only intended for patients with their first proximal DVT. The L-TRRiP and AIM-SHA-RP models are intended for all patients with the first VTE without malignancy, and the Worcester VTE model is intended for all patients with the first VTE, including cancer-associated VTE, whereas the VTE-PREDICT model is intended for all VTE patients without malignancy, with either the first or a recurrent VTE. The prediction model of the Men and HERDOO2 rule is only intended for women with an unprovoked VTE, as all men with an unprovoked VTE are considered to be at a high risk of recurrence. All other models were intended for patients with their first unprovoked VTE only. However, these models use different definitions of provoked VTE. For instance, in the Vienna score, immobilization and hospitalization are not considered provoking factors, and in the HERDOO2 and DASH scores estrogen use is not considered a provoking factor. Thrombophilia, which was an exclusion criterion, was also defined differently (eg, in the DASH score it was defined as antithrombin deficiency or known antiphospholipid antibodies, whereas the HERDOO2 model, in addition to these factors, excluded patients with protein C or S deficiency, homozygous factor V Leiden or prothrombin mutation, or heterozygous mutation in both genes).78 These differing definitions of provoked VTE make the models inconvenient for use in clinical practice, since it is unclear to which patients they can be applied, and the definitions of unprovoked VTE do not follow the guidance of the International Society on Thrombosis and Haemostasis (ISTH).79

Most models were developed using data from prospective cohort studies. For the development of the DASH score, pre- and post D-dimer models and VTE-PREDICT model, individual patient data from multiple studies including trials were used. The use of randomized clinical trial data for development of prediction models might limit generalizability of the model because of selective patient inclusion or overly specialized predictor measurement.80 The performance of these 3 models during external validation varied across the validation studies with the C statistic ranging from 0.48 to 0.71 (Table 2 and Supplementary material, Table S1). The value of 0.71 originated from an external validation of the VTE-PREDICT model in data from the EINSTEIN-CHOICE, which is also a trial.33 The AIM-SHA-RP model was developed using data from the Danish nationwide registry.3,31 The advantages of such data sources are the availability of a high number of patients and variety of recorded variables, while limitations are data availability for potential candidate predictors and that the predictors from administrative health care data may be measured differently from real world practice, which may reduce generalizability.81 The external validity of the AIM-SHA-RP model has not been determined yet. The Continu-8 model was developed using data from a single-center cohort study, in which patients were treated according to a clinical care pathway, where anticoagulant treatment was tailored by incorporating the presence of residual vein thrombosis.82 Since tailoring the treatment to the presence of residual vein thrombosis is currently not routine practice, this might affect the options to study the external validity of this model.

Sex, age, type, and location of the index event and D-dimer levels are the most used predictors. Next to these, other clinical variables, such as comorbidities, provoking factors, concomitant medication use, several laboratory variables, and genetic variables have been included. Only the pre D-dimer, Worcester VTE, L-TRRiP model D, AIM-SHA-RP, and VTE-PREDICT models use solely clinical variables. The advantage of using clinical variables is that they do not require additional laboratory measurements and therefore are the easiest and most feasible for use in clinical practice. The L-TRRiP model C includes genetic variables, which can be measured during anticoagulant therapy. The HERDOO2 and DAMOVES scores include D-dimer levels measured during anticoagulant treatment. The other models (ie, DASH, Vienna, DAMOVES, post D-dimer, L-TRRiP model A and B) include coagulation measurements, which were obtained after discontinuation of anticoagulant therapy. The Vienna update and post D-dimer model include a variable to account for lag time between discontinuation and D-dimer measurement. For the other models, the D-dimer level was obtained after discontinuation of anticoagulant therapy for a fixed period, which was short (not specified) (Vienna), 3 to 5 weeks (DASH) or 3 months (L-TRRiP model A and B). Since D-dimer values change within 3 months after stopping the anticoagulant treatment,83 they should be obtained at the same time point used in the model development. This would mean that the patients have to discontinue the anticoagulant therapy to obtain the risk score, but afterwards may need to restart the therapy, which is less convenient for clinical practice.

The total number of included predictors ranged from 3 to 16. Within the L-TRRiP models, the most extensive model (A), including 16 predictors, discriminated best (C statistic 0.72), whereas the most basic model (D), including 9 predictors, had the C statistic of 0.69 at the internal validation. This shows that a higher number of predictors might improve the model performance. However, the inclusion of multiple laboratory values might be a barrier for practical implementation, especially if these measurements are not routinely performed or require anticoagulant interruption. Because of this tradeoff between the number of predictors and clinical feasibility, model C was deemed the most useful for clinical practice.11

Almost all models predict the risk of all VTE recurrences, while the L-TRRiP models are restricted to unprovoked recurrences (ie, in the absence of a provoking factor such as malignancy, surgery, pregnancy, hospitalization, or hormone use). The VTE-PREDICT model consists of a score to predict recurrent VTE and a score to predict major bleeding.

Most models consist of a scoring system that calculates a total score, which is then classified as a high or low risk. The Vienna score provides a nomogram to calculate the total score. Only the pre- and post D-dimer models, L-TRRiP models, and VTE-PREDICT model provide the absolute risk of VTE recurrence at 3, 2, and 5 years, respectively. The VTE-PREDICT model can estimate, through an online calculator, the risk of VTE recurrence and bleeding with and without extended anticoagulant therapy.84

The models also differ in performance. The discriminative capacity ranged from poor to excellent, with the C statistic ranging between 0.56 (AIM-SHA-RP in men) and 0.91 (DAMOVES) during the model development. In the external validation, the C statistic ranged from 0.39 (Vienna update)36 to 0.83 (DAMOVES).38 However, the external validation of DAMOVES was deemed at a high risk of bias by de Winter et al77 due to concerns regarding the analysis and a lacking outcome definition. Calibration measures were less often reported in the development and validation studies. The L-TRRiP models C and D were shown to be well calibrated during external validation, although for model C the predicted risks were overestimated in the highest risk quintile. The calibration of the VTE-PREDICT model differed across the populations used for the external validation; for example, calibration plots indicated that the recurrence risk was underestimated in the patients with higher risks in the MEGA study, whereas it was overestimated in these patients in the GARFIELD-VTE study.
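To make the discrimination measure discussed above concrete, the following minimal Python sketch computes a C statistic for a binary outcome as the proportion of event / non-event patient pairs in which the patient with the event received the higher predicted risk (ties count as one half). The function name and the example data are hypothetical and serve as an illustration only. The sketch ignores censoring, whereas the development and validation studies of time-to-event models would typically use a censoring-aware version such as Harrell's C.

```python
from itertools import product


def c_statistic(predicted_risks, outcomes):
    """Proportion of (event, non-event) pairs ranked correctly by predicted risk.

    Ties in predicted risk count as 0.5. Censoring is ignored (binary outcome).
    """
    events = [p for p, y in zip(predicted_risks, outcomes) if y == 1]
    non_events = [p for p, y in zip(predicted_risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e, n in product(events, non_events):
        if e > n:
            concordant += 1.0
        elif e == n:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical predicted recurrence risks and observed outcomes (1 = recurrence)
risks = [0.05, 0.10, 0.20, 0.30, 0.40, 0.15]
recurred = [0, 0, 1, 0, 1, 0]
print(c_statistic(risks, recurred))  # 7 of 8 pairs concordant -> 0.875
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking. The C statistic does not indicate whether the predicted risks are numerically correct, which is why calibration needs to be reported separately.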

According to de Winter et al,77 only the L-TRRiP and pre- and post D-dimer models had an overall low risk of bias, whereas the other models published before 2020 were judged to be at a high risk of bias. This was mainly due to the statistical analyses, including concerns on handling of missing data and a risk of overfitting.

Models for prediction of bleeding

Most bleeding risk models that were developed solely for VTE patients are intended for adult patients with their first or recurrent symptomatic VTE, including PE and / or DVT. The PE-SARD model was only developed for patients with acute PE. The model by Seiler et al50 was developed for patients aged 65 years or older. The model by Martinez et al51 and the CHAP model were developed for patients with the first VTE. The model by Alonso et al52 probably included patients with the first VTE, as the patients with previous anticoagulant use were excluded, but this was not stated explicitly. In addition, the model by Kearon et al43 was developed for patients with unprovoked VTE only, whereas the CHAP model was developed for patients with the first unprovoked or weakly provoked VTE.

Most models were developed using data from clinical trials. Many of these trials excluded patients at high risk of bleeding, for instance by excluding individuals with recent major bleeding, severe renal insufficiency, active cancer, or on antiplatelet therapy. The RIETE, Seiler et al,50 PE-SARD, and CHAP models were developed using data from prospective cohort studies or registries, whereas the models by Martinez et al51 and Alonso et al52 were developed using routine health care data.

Age, history of (major) bleeding, active malignancy, antiplatelet therapy, and the presence of anemia or renal insufficiency were most often included as predictors in the models. All included variables are clinical parameters or routinely assessed laboratory measurements (hemoglobin, creatinine, and platelet count). The total number of included predictors ranged from 3 to 16.

All the models developed for VTE patients included major bleeding in the outcome definition, while approximately half of the models also included clinically relevant nonmajor bleeding. The Kuijer et al,42 RIETE, and Martinez et al51 scores were only developed to predict bleeding within the first 90 days of treatment. The PE-SARD model was only intended to predict in-hospital bleeding during hospitalization for the index PE. Although these models might also predict long-term bleeding outcomes, this should first be demonstrated during external validation studies with long-term follow-up before they can be used for clinical decision making regarding the benefit of extended anticoagulant therapy. In addition, many of the development studies, as well as validation studies, had a median follow-up below 1 year, which makes the long-term performance of the models uncertain. Only the models by Kearon et al,43 Seiler et al,50 and the CHAP model were developed using data with a median follow-up longer than 2 years. All the scores, except for EINSTEIN,48 Seiler et al,50 Alonso et al,52 PE-SARD, and CHAP have been validated at least once in a cohort with a median follow-up longer than 1 year.

Almost all models only predict bleeding during anticoagulant therapy. Only the score by Alonso et al52 was intended to predict bleeding in the first 180 days after VTE diagnosis, irrespective of the duration of anticoagulant use. The scores by Kearon et al,43 Seiler et al,50 and CHAP were only intended for the prediction of bleeding during extended anticoagulant therapy (ie, beyond the initial treatment phase of 3 months). The VTE-PREDICT score provides the risk of bleeding with and without extended treatment. The models by Kuijer et al,42 Kearon et al,43 RIETE, Seiler et al,50 and Martinez et al51 did not include patients using DOACs. This might affect the performance of these models in current clinical practice, where DOACs are generally the preferred treatment. During the development of the VTE-BLEED score, performance was assessed separately for patients on a DOAC and a VKA, which showed a relevant difference in the C statistic of 0.72 vs 0.59 in dabigatran and warfarin users, respectively.47 This illustrates that the external validity of such scores in DOAC users should be evaluated before these models can be implemented in current clinical practice.

Most of the bleeding risk models only provide a scoring system to classify patients as being at a low or high risk of bleeding. Only the EINSTEIN, Alonso et al,52 CHAP, and VTE-PREDICT models provide a formula to calculate the absolute risk of (major) bleeding, at 21 or 210 days (EINSTEIN), 180 days (Alonso et al52), 1 year (CHAP), or 5 years (VTE-PREDICT).
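As an illustration of what such a formula can look like, the minimal Python sketch below converts a linear predictor into an absolute risk at a fixed horizon. It assumes a proportional hazards (Cox-type) model, which may not be the form used by each of the models named above. The baseline survival, the mean linear predictor, the predictor names, and all coefficients are hypothetical and do not correspond to any published model.

```python
import math


def absolute_risk(baseline_survival, linear_predictor, mean_linear_predictor=0.0):
    """Absolute risk at the horizon for which baseline_survival was estimated.

    Assumes a proportional hazards model: risk = 1 - S0(t) ** exp(lp - mean lp).
    """
    if not 0.0 < baseline_survival <= 1.0:
        raise ValueError("baseline_survival must be in (0, 1]")
    return 1.0 - baseline_survival ** math.exp(linear_predictor - mean_linear_predictor)


# Hypothetical coefficients (log hazard ratios); not taken from any published model
COEFFICIENTS = {"age_per_10_years": 0.20, "history_of_bleeding": 0.60, "anemia": 0.40}

# Hypothetical patient: 70 years old, previous bleeding, no anemia
patient = {"age_per_10_years": 7.0, "history_of_bleeding": 1, "anemia": 0}

lp = sum(COEFFICIENTS[name] * patient[name] for name in COEFFICIENTS)
risk = absolute_risk(baseline_survival=0.98, linear_predictor=lp, mean_linear_predictor=1.5)
print(f"Predicted absolute bleeding risk at the model horizon: {risk:.1%}")
```

An output on the absolute risk scale, unlike a risk category, can be set directly against an absolute recurrence risk estimated for the same horizon.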

For the models developed for VTE patients, the C statistic from the internal validation ranged from 0.59 (VTE-BLEED for all types of bleeding in warfarin users during entire period) to 0.78 (VTE-BLEED score for major bleeding during stable anticoagulation in warfarin users).47 Calibration plots were only provided for the score by Alonso et al,52 CHAP, and VTE-PREDICT models, and indicated adequate to good calibration. The PE-SARD model showed good agreement between the predicted and observed risks stratified by risk category. The C statistic values from the external validation were generally lower, ranging from 0.49 (1 validation of Kuijer et al55) to 0.80 (1 validation of RIETE70). The last value was found during a validation study of the RIETE model70 in the same registry as the original development study, only with a longer inclusion period, which does not make the cohort as independent as one would prefer for an external validation. Within the external validation studies with a median follow-up above 1 year, the C statistic ranged from 0.51 (RIETE)54 to 0.65 (ACCP)54.

All derivation studies published before 2020 were judged to have a high risk of bias due to factors regarding the statistical analysis, as critically appraised by de Winter et al.77

The models that were developed in other patient groups using anticoagulation but validated in VTE patients showed C statistic values ranging from 0.47 (OBRI)72 to 0.81 (HAS-BLED)74 during the external validation in a VTE population, indicating they might also be able to predict the risk of bleeding in VTE patients. However, as in the models intended solely for VTE patients, most external validation studies had a follow-up shorter than 1 year, and therefore their long-term performance is uncertain at best.

How should we proceed to implement prediction models?

Even though there are many models for the prediction of recurrent VTE and bleeding available, they are seldom used in daily clinical practice to determine treatment duration after the first VTE,85,86 and none of them has been incorporated in the current guidelines. The main reason for this is a lack of sufficiently accurate and validated models with the added value demonstrated in clinical practice. For example, the National Institute for Health and Care Excellence committee stated in 2020 that the current models were not sufficiently accurate or validated to be used as the sole basis for a decision on treatment duration, and they recommended further research to compare the prognostic accuracy of the prediction models and the clinical judgement.8 Likewise, the American Society of Hematology guideline (2020) suggests against routine use of prognostic scores, because evidence on the impact of prognostic scores is lacking.7 The Subcommittee on Predictive and Diagnostic Variables in Thrombotic Disease of the International Society on Thrombosis and Haemostasis Scientific and Standardization Committee has suggested to routinely assess bleeding risk in all VTE patients in a standardized way, preferably with the use of a validated prediction model to support anticoagulation management decisions.87 Another reason why the models are not regularly used and implemented in guidelines might be that most of them consist of a scoring system that does not provide the absolute risk of recurrence or bleeding, which makes it difficult to balance these risks. Physicians also report that they do not use the prediction models, because they do not know how to combine and translate these scores into clinical practice.85

Implementation studies

The effect of implementation of the model on outcomes in clinical practice has been studied for very few models. The Men and HERDOO2 rule was evaluated in a management study including 2785 participants with their first unprovoked VTE. Women with a low risk of recurrent VTE according to the HERDOO2 criteria discontinued anticoagulant therapy, whereas management for men and high-risk women was left at the discretion of the treating physician. In the low-risk women who discontinued anticoagulants, the VTE recurrence rate was 3 per 100 patient years (py). In men and high-risk women this was 8.1/100 py for those who discontinued and 1.6/100 py for those who continued the treatment.88 This study showed that discontinuation of anticoagulant therapy was safe for women with unprovoked VTE with a low recurrence risk, as the recurrence rate after discontinuation was low. However, the limitation of the HERDOO2 rule is that women with VTE during estrogen use were classified as having unprovoked VTE. These women accounted for more than half of the low-risk group and had a very low risk of VTE recurrence (1.4/100 py). The low-risk women aged below 50 years without estrogen use had a recurrence risk of 3.1/100 py. The recurrence risk in the women aged 50 years or older without estrogen use, who were classified as low-risk, was 6.8/100 py, which is actually an intermediate recurrence risk. These results again illustrate that risk classification becomes more accurate when more factors are taken into account rather than just sex and the presence of provoking factors.
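The rates in this paragraph are expressed per 100 patient-years. As an illustration of how such a rate is derived and how it relates to cumulative risk, the following minimal Python sketch uses hypothetical event counts and follow-up time (not data from the HERDOO2 management study). The conversion to cumulative incidence assumes a constant hazard and ignores competing risks, which is a simplification because the recurrence risk is generally not constant over time.

```python
import math


def rate_per_100_py(events, patient_years):
    """Incidence rate expressed per 100 patient-years of follow-up."""
    if patient_years <= 0:
        raise ValueError("patient_years must be positive")
    return 100.0 * events / patient_years


def cumulative_incidence(rate_per_100, years):
    """Approximate cumulative incidence, assuming a constant hazard over time."""
    return 1.0 - math.exp(-(rate_per_100 / 100.0) * years)


# Hypothetical cohort: 12 recurrences during 400 patient-years of follow-up
rate = rate_per_100_py(events=12, patient_years=400.0)
print(rate)  # 3.0 per 100 patient-years
print(f"{cumulative_incidence(rate, years=5):.1%}")  # approximate 5-year risk
```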

The VISTA randomized controlled trial compared the risk of VTE recurrence in patients with unprovoked VTE for whom treatment duration was based on the Vienna model, with treatment duration according to usual care.89 In this trial, 441 patients and their treating physicians received the results of risk calculation using the Vienna model, accompanied by a discussion on the clinical consequences of this risk. The other 442 patients received standard care. The cumulative incidences of recurrent VTE in the Vienna group (10.4%) and control group (11.3%) were similar, although more patients in the Vienna group continued anticoagulant treatment.89 Although there are several limitations, including a moderate adherence rate and premature termination of the trial due to a declining accrual rate, this trial did not show an advantage of using the Vienna model in treatment decisions over usual care. Given the reasonable performance in the external validation, this result was not expected.

Future perspective

To enable tailored treatment based on individual prediction of recurrent VTE and bleeding risk, the added value of prediction scores should be demonstrated by implementation or management studies using the existing models. Ideally, both the risk of recurrent VTE and (major) bleeding should be considered. The authors are currently conducting such a trial, in which the advice to stop or continue anticoagulant treatment is based on the risk of recurrent VTE and major bleeding as estimated by the L-TRRiP and VTE-BLEED scores, respectively (Netherlands trial register: NL9003).

In addition, the prediction of both recurrent VTE and major bleeding should be improved since the current prediction models are still suboptimal: almost all models perform only modestly, with a C statistic of around 0.55 to 0.65 during the external validation, and none of the models repeatedly showed a C statistic exceeding 0.75 during the external validation. Furthermore, several recent models have not been externally validated yet. As described above, many of the models show limitations in methodology or convenience in clinical use. Therefore, we should aim to improve the prediction of recurrent VTE and (major) bleeding, preferably by updating current models, and otherwise by developing new models according to current development standards. However, despite these limitations, the current models might still discriminate better between patients with high and low risk of recurrence than the current provoked / unprovoked distinction that is made by the guidelines. This is, for example, shown in the development study of the L-TRRiP model, where the L-TRRiP models C and D showed a C statistic of 0.69, whereas the C statistic of the provoked / unprovoked status was 0.61.11
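For a dichotomous classifier such as the provoked / unprovoked status, the C statistic reduces to the mean of sensitivity and specificity, because all patients within the same category are tied. The minimal Python sketch below shows this calculation. The counts are hypothetical and were not taken from the L-TRRiP development study.

```python
def c_statistic_binary(tp, fn, fp, tn):
    """C statistic of a dichotomous predictor: (sensitivity + specificity) / 2.

    tp / fn: patients with recurrence classified as high / low risk
    fp / tn: patients without recurrence classified as high / low risk
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2.0


# Hypothetical cohort: "unprovoked" is treated as the high-risk category
c = c_statistic_binary(tp=70, fn=30, fp=50, tn=50)
print(round(c, 2))  # sensitivity 0.70, specificity 0.50 -> 0.6
```

A model that assigns a graded risk can separate patients within each of the 2 categories, which a dichotomy by construction cannot do.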

The current models might be improved by adding variables or updating the model coefficients.90 For instance, Raj et al41 performed a validation study of the HERDOO2, DASH, and Vienna models, but also assessed the added value of incorporating the pulmonary vascular obstruction index into these models. This resulted in improved model performance, as shown by the increases in the C statistic ranging from 0.06 to 0.11 points. Although this analysis was limited because only PE patients were included, similar approaches including parameters from diagnostic imaging, genetic markers,91 or proteomics92 might improve the model performance. Likewise, development of new models using novel modelling approaches, such as artificial intelligence, might improve predictions.93 These complex models can be implemented more easily nowadays, as they can be made available through applications or web-based calculators. However, as in the case of the existing models, the newly developed or updated models should be externally validated, and their added value in clinical practice should be demonstrated before they are implemented.

Lastly, to enable tailored treatment after the first VTE, we should also consider other relevant outcomes, such as post-thrombotic syndrome and post-PE syndrome, which have a considerable impact on the quality of life of the patients.94,95 These long-term sequelae of VTE share risk factors with recurrent VTE,16,94-96 and in addition they occur more often after recurrent VTE.97,98 Therefore, the efforts to improve treatment after the first VTE should not only focus on anticoagulant treatment duration, VTE recurrence, and bleeding, but also on other treatment modalities and outcomes.99

Conclusions

In conclusion, improving long-term outcomes after the first VTE requires optimal discrimination between patients who would and would not benefit from prolonged anticoagulant treatment. Prediction models are a promising option to improve the decision making for indefinite anticoagulant therapy in these patients. However, before the prediction models can be implemented in guidelines and routine clinical practice, their added value should be assessed by implementation studies. Furthermore, there is still room for improvement of the current models and their prediction quality.