Table 7 Training and testing data size and method used to split training and testing data

From: A scoping methodological review of simulation studies comparing statistical and machine learning approaches to risk prediction for time-to-event data

 

| Study | Training data size | Testing data size | Method |
| --- | --- | --- | --- |
| Geng et al. (2014)* [26] | 100, 200 | 1000, 1000 | Independent samples from DGM |
| Golmakani et al. (2020) [31] | 450, 720 | 50, 80 | 10-fold cross-validation |
| Gong et al. (2018) [27] | 200, 400, 500, 600, 800, 1000 | 200, 400, 500, 600, 800, 1000 | Independent samples from DGM |
| Hu and Steingrimsson (2018) [28] | 200, 500 | 1000, 1000 | Independent samples from DGM |
| Katzman et al. (2018)* [29] | 4000 | 1000 | Independent samples from DGM |
| Lowsky et al. (2012)* [25] | 500, 1000, 3000, 7500 | 13525, 13525, 13525, 13525 | Independent samples from DGM |
| Omurlu et al. (2009) [24] | 50, 100, 250, 500 | 50, 100, 250, 500 | Unclear |
| Steingrimsson and Morrison (2020) [32] | 250, 500, 1000, 1500, 3000 | 250, 500, 1000, 1500, 3000 | Independent samples from DGM |
| Wang and Li (2019) [30] | 150 | 150 | Two-fold cross-validation |
| Xiang et al. (2000) [23] | 100, 200 | 100, 200 | Randomly split whole sample into equal training and testing sets |

*These articles also included validation datasets.
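
For the cross-validation rows, the per-fold training and testing sizes follow from the total sample size and the number of folds. The sketch below is only illustrative: the helper name `kfold_sizes` is hypothetical, and the totals of 500, 800, and 300 are assumptions obtained by adding the training and testing sizes reported in the table, not figures stated by the cited studies.

```python
def kfold_sizes(n_total: int, k: int) -> tuple[int, int]:
    """Return (training size, testing size) for one fold of k-fold
    cross-validation, assuming k divides n_total evenly."""
    if k < 2:
        raise ValueError("k must be at least 2")
    if n_total % k != 0:
        raise ValueError("this sketch assumes k divides n_total evenly")
    test_size = n_total // k          # one fold is held out for testing
    train_size = n_total - test_size  # the remaining k - 1 folds are used for training
    return train_size, test_size


# Assumed totals (training size + testing size from the table), paired with the fold count.
for n_total, k in [(500, 10), (800, 10), (300, 2)]:
    train_size, test_size = kfold_sizes(n_total, k)
    print(f"n={n_total}, k={k}: train={train_size}, test={test_size}")
```

Under these assumed totals, 10-fold cross-validation gives the 450/50 and 720/80 splits, and two-fold cross-validation gives the 150/150 split.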