Table 2 Development analysis characteristics of the 62 included publications, by study type

From: Risk of bias of prognostic models developed using machine learning: a systematic review in oncology

  Columns, each reported as n (%): All (n = 62); Development only (n = 48); Development and external validation (n = 14)
Development characteristics
Data source*
  Randomised controlled trial 1 (1.6) - 1 (7.1)
  Prospective cohort 9 (14.5) 9 (18.8) -
  Retrospective cohort 14 (22.6) 11 (22.9) 3 (21.4)
  Registry 21 (33.9) 15 (31.3) 6 (42.9)
  Routine care database 9 (14.5) 7 (14.6) 2 (14.3)
  Other** 3 (4.8) 2 (4.2) 1 (7.1)
  Unclear 5 (8.1) 4 (8.3) 1 (7.1)
Setting***
  Primary care 2 (3.2) 2 (4.2) -
  Secondary care 36 (58.1) 29 (60.4) 7 (50)
  Tertiary care 10 (16.1) 7 (14.6) 3 (21.4)
  General population 5 (8.1) 3 (6.3) 2 (14.3)
  Other**** 3 (4.8) 3 (6.3) -
  Unclear 6 (9.7) 4 (8.3) 2 (14.3)
Multicentre*****
  No 26 (41.9) 24 (50) 2 (14.3)
  Yes 13 (21) 7 (14.6) 6 (42.9)
  Unclear 23 (37.1) 17 (35.4) 6 (42.9)
Geographic location******
  South America 2 (3.2) 2 (4.2) -
  Asia 8 (12.9) 6 (12.5) 2 (14.3)
  Europe 13 (21) 13 (27.1) -
  Canada 3 (4.8) 3 (6.3) -
  USA 21 (33.9) 15 (31.3) 6 (42.9)
  Europe, North America, Australia 1 (1.6) 1 (2.1) -
  Europe, South America 1 (1.6) - 1 (7.1)
  South Asia, USA 1 (1.6) 1 (2.1) -
  Unclear 12 (19.4) 7 (14.6) 5 (35.7)
Intended user
  Health care providers 34 (54.8) 27 (56.3) 7 (50)
  Public/patients 2 (3.2) 2 (4.2) -
  Researchers 1 (1.6) 1 (2.1) -
  Health care providers and public/patients 4 (6.5) 1 (2.1) 3 (21.4)
  Health care providers and researchers 2 (3.2) 2 (4.2) -
  Unclear 19 (30.6) 15 (31.3) 4 (28.6)
Aim of model
  Predict risk 36 (58.1) 25 (52.1) 11 (78.6)
  Classify patients 25 (40.3) 23 (47.9) 2 (14.3)
  Predict length of stay (continuous outcome) 1 (1.6) - 1 (7.1)
  1. *Validation characteristics for data source are: Randomised controlled trial: 2/14 (14.3%); Prospective cohort: 3/14 (21.4%); Retrospective cohort: 4/14 (28.6%); Registry: 2/14 (14.3%); Routine care database: 2/14 (14.3%); Other (survey): 1/14 (7.1%)
  2. **Other includes an audit, a survey, and a combined data source (hospital data, research data and a registry)
  3. ***Validation characteristics for setting are: Secondary care: 7/14 (50%); Tertiary care: 4/14 (28.6%); General population: 2/14 (14.3%); Unclear: 1/14 (7.1%)
  4. ****Other includes a combination of hospitals, hospices and nursing homes; NTT medical center in Tokyo; and a combination of primary and tertiary care
  5. *****Validation characteristics for multicentre are: No: 8/14 (57.1%); Yes: 3/14 (21.4%); Unclear: 3/14 (21.4%)
  6. ******Validation characteristics for geographical location are: South America: 1/14 (7.1%); Asia: 5/14 (35.7%); USA: 5/14 (35.7%); Unclear: 3/14 (21.4%)
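
The percentages in the table and footnotes appear to be each count divided by its column total, rounded to one decimal place. The sketch below reproduces that arithmetic for the multicentre counts in footnote 5. The helper name `pct` and the dictionary `validation_multicentre` are illustrative assumptions, not part of the original article.

```python
def pct(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Counts copied from footnote 5 (validation characteristics for multicentre).
# The dictionary name is hypothetical and used only for this illustration.
validation_multicentre = {
    "No": 8,
    "Yes": 3,
    "Unclear": 3,
}

# The counts should add up to the 14 studies with external validation.
total = sum(validation_multicentre.values())

for label, count in validation_multicentre.items():
    print(f"{label}: {count}/{total} ({pct(count, total)}%)")
```

The same check can be applied to any other row by swapping in its counts and the matching column total (62, 48 or 14).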