Abrahamowicz M, du Berger R, Grover SA. Flexible modelling of the effects of serum cholesterol on coronary heart disease mortality. Am J Epidemiol. 1997;145:714–29.

Altman DG, Andersen PK. Bootstrap investigation of the stability of a Cox regression model. Stat Med. 1989;8:771–83.

Altman DG, Lausen B, Sauerbrei W, Schumacher M. The dangers of using ‘optimal’ cutpoints in the evaluation of prognostic factors. J Natl Cancer Inst. 1994;86:829–35.

Altman DG, McShane LM, Sauerbrei W, Taube SE. Reporting recommendations for tumor marker prognostic studies (REMARK): explanation and elaboration. PLoS Med. 2012;9:e1001216.

Antoniadis A, Gijbels I, Verhasselt A. Variable selection in additive models using P-splines. Technometrics. 2012;54:425–38.

Arem H, Moore SC, Patel A, Hartge P, Berrington de Gonzalez A, Visvanathan K, Campbell PT, Freedman M, Weiderpass E, Adami HO, Linet MS, Lee IM, Matthews CE. Leisure time physical activity and mortality: a detailed pooled analysis of the dose-response relationship. JAMA Intern Med. 2015;175:959–67.

Augustin N, Sauerbrei W, Schumacher M. The practical utility of incorporating model selection uncertainty into prognostic models for survival data. Stat Model. 2005;5:95–118.

Becher H. Analysis of continuous covariates and dose-effect analysis. In: Ahrens W, Pigeot I, editors. Handbook of epidemiology. 2nd ed. Heidelberg: Springer; 2014.

Becher H, Lorenz E, Royston P, Sauerbrei W. Analysing covariates with spike at zero: a modified FP procedure and conceptual issues. Biometrical J. 2012;54:686–700.

Benedetti A, Abrahamowicz M. Using generalized additive models to reduce residual confounding. Stat Med. 2004;23:3781–801.

Binder H, Schumacher M. Allowing for mandatory covariates in boosting estimation of sparse high-dimensional survival models. BMC Bioinform. 2008;9:14.

Binder H, Sauerbrei W, Royston P. Comparison between splines and fractional polynomials for multivariable model building with continuous covariates: a simulation study with continuous response. Stat Med. 2013;32:2262–77.

Boulesteix AL, Binder H, Abrahamowicz M, Sauerbrei W. On the necessity and design of studies comparing statistical methods. Biometrical J. 2018;60:216–8.

Breiman L. The little bootstrap and other methods for dimensionality selection in regression: X-fixed prediction error. J Am Stat Assoc. 1992;87:738–54.

Breiman L. Better subset regression using the nonnegative garrote. Technometrics. 1995;37:373–84.

Breiman L. Statistical Modeling: The two cultures. Stat Sci. 2001;16:199–231.

Buckland ST, Burnham KP, Augustin NH. Model selection: an integral part of inference. Biometrics. 1997;53:603–18.

Bühlmann P, Hothorn T. Boosting algorithms: regularization, prediction and model fitting. Stat Sci. 2007;22:477–505.

Burnham KP, Anderson DR. Model selection and multimodel inference: a practical information-theoretic approach. New York: Springer; 2002.

Bursac Z, Gauss CH, Williams DK, Hosmer DW. Purposeful selection of variables in logistic regression. Source Code Biol Med. 2008;3:17.

Chatfield C. Model uncertainty, data mining and statistical inference (with discussion). J Royal Stat Soc Series A. 1995;158:419–66.

Chatfield C. Confessions of a pragmatic statistician. Statistician. 2002;51:1–20.

Chen C, George SL. The bootstrap and identification of prognostic factors via Cox’s proportional hazards regression model. Stat Med. 1985;4:39–46.

Chouldechova A, Hastie T. Generalized additive model selection. arXiv preprint 2015;arXiv:1506.03850.

Copas JB, Long T. Estimating the residual variance in orthogonal regression with variable selection. J Royal Stat Soc Series D. 1991;40:51–9.

Cox DR. Comment on Breiman, L. (2001). Statistical modeling: the two cultures. Stat Sci. 2001;16:216–8.

Dakna M, Harris K, Kalousi A, Carpentier S, Kolch W, Schanstra JP, Haubitz M, Vlahou A, Mischak H, Girolami M. Addressing the challenge of defining valid proteomic biomarkers and classifiers. BMC Bioinform. 2010;11:594.

de Bin R, Sauerbrei W. Handling co-dependence issues in resampling-based variable selection procedures: a simulation study. J Stat Comput Simul. 2018;88:28–55.

de Bin R, Janitza S, Sauerbrei W, Boulesteix AL. Subsampling versus bootstrapping in resampling-based model selection for multivariable regression. Biometrics. 2016;72:272–80.

de Boor C. A practical guide to splines. Revised ed. New York: Springer; 2001.

Dorie V, Hill J, Shalit U, Scott M, Cervone D. Automated versus do-it-yourself methods for causal inference: lessons learned from a data analysis competition. Stat Sci. 2019;34:43–68.

Draper D. Assessment and propagation of model selection uncertainty (with discussion). J Royal Stat Soc Series B. 1995;57:45–97.

Dunkler D, Plischke M, Leffondré K, Heinze G. Augmented backward elimination: a pragmatic and purposeful way to develop statistical models. PLoS ONE. 2014;9:e113677.

Dunkler D, Sauerbrei W, Heinze G. Global, parameterwise and joint shrinkage factor estimation. J Stat Softw. 2016;69:1–19.

Efron B. Comment on Breiman, L. (2001). Statistical modeling: the two cultures. Stat Sci. 2001;16:218–9.

Efron B, Hastie T, Johnstone I, Tibshirani R. Least angle regression. Ann Stat. 2004;32:407–99.

Efroymson MA. Multiple regression analysis. In: Ralston A, Wilf HS, editors. Mathematical methods for digital computers. New York: John Wiley; 1960.

Eilers PHC, Marx BD. Flexible smoothing with B-splines and penalties (with comments and rejoinder). Stat Sci. 1996;11:89–121.

Fan J, Li R. Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc. 2001;96:1348–60.

Freund Y, Schapire R. Experiments with a new boosting algorithm. In: Proceedings of the Thirteenth International Conference on Machine Learning Theory. San Francisco, CA: Morgan Kaufmann Publishers Inc; 1996.

Friedman JH. Greedy function approximation: a gradient boosting machine. Ann Stat. 2001;29:1189–232.

Friedman JH, Hastie T, Tibshirani R. Additive logistic regression: a statistical view of boosting (with discussion). Ann Stat. 2000;28:337–407.

Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. 2010;33:1–22.

Fröhlich H. Including network knowledge into Cox regression models for biomarker signature discovery. Biometrical J. 2014;56:287–306.

Gong G. Some ideas on using the bootstrap in assessing model variability. In: Heiner KW, Sacher RS, Wilkinson JW, editors. Computer Science and Statistics: Proceedings of the 14th Symposium on the Interface. New York: Springer; 1982.

Good DM, Zürbig P, Argilés A, Bauer HW, Behrens G, Coon JJ, Dakna M, Decramer S, Delles C, Dominiczak AF, Ehrich JHH. Naturally occurring human urinary peptides for use in diagnosis of chronic kidney disease. Mol Cell Proteomics. 2010;9:2424–37.

Greenland S. Avoiding power loss associated with categorization and ordinal scores in dose-response and trend analysis. Epidemiology. 1995;6:450–4.

Groenwold RHH, Klungel OH, van der Graaf Y, Hoes AW, Moons KGM. Adjustment for continuous confounders: an example of how to prevent residual confounding. Can Med Assoc J. 2013;185:401–6.

Harrell FE. Regression modeling strategies: with applications to linear models, logistic and ordinal regression, and survival analysis. New York: Springer; 2001.

Harrell FE. Regression modeling strategies: with applications to linear models, logistic and ordinal regression, and survival analysis. 2nd ed. New York: Springer; 2015.

Harrell FE, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996;15:361–87.

Hastie T, Tibshirani R. Generalized additive models. New York: Chapman & Hall/CRC; 1990.

Hastie T, Tibshirani R, Friedman J. The elements of statistical learning. 2nd ed. New York: Springer; 2009.

Hastie T, Tibshirani R, Wainwright M. Statistical learning with sparsity: the lasso and generalizations. Monographs on statistics and applied probability. Boca Raton: CRC Press; 2015.

Heinze G, Dunkler D. Five myths about variable selection. Transplant Int. 2017;30:6–10.

Heinze G, Wallisch C, Dunkler D. Variable selection – a review and recommendations for the practicing statistician. Biometrical J. 2018;60:431–49.

Hilsenbeck SG, Clark GM, McGuire W. Why do so many prognostic factors fail to pan out? Breast Cancer Res Treat. 1992;22:197–206.

Hoerl AE, Kennard RW. Ridge Regression: biased estimation for nonorthogonal problems. Technometrics. 1970;12:55–67.

Hoeting JA, Madigan D, Raftery AE, Volinsky CT. Bayesian model averaging: a tutorial. Stat Sci. 1999;14:382–417.

Hofner B, Hothorn T, Kneib T, Schmid M. A framework for unbiased model selection based on boosting. J Comput Graphical Stat. 2011;20:956–71.

Hosmer D, Lemeshow S, May S. Applied survival analysis. 2nd ed. Hoboken: Wiley; 2008.

Hosmer D, Lemeshow S, Sturdivant RX. Applied logistic regression. 3rd ed. Hoboken: Wiley; 2013.

Huebner M, le Cessie S, Schmidt C, Vach W, On behalf of the Topic Group “Initial Data Analysis” of the STRATOS Initiative. A Contemporary Conceptual Framework for Initial Data Analysis. Observational Studies. 2018;4:171–92.

Janitza S, Binder H, Boulesteix AL. Pitfalls of hypothesis tests and model selection on bootstrap samples: causes and consequences in biometrical applications. Biometrical J. 2016;58:447–73.

Jenkner C, Lorenz E, Becher H, Sauerbrei W. Modeling continuous covariates with a ‘spike’ at zero: bivariate approaches. Biometrical J. 2016;58:783–96.

Lee PH. Is a cutoff of 10% appropriate for the change-in-estimate criterion of confounder identification? J Epidemiol. 2014;24:161–7.

Leeb H, Pötscher BM. Model selection and inference: facts and fiction. Econometric Theory. 2005;21:21–59.

Leffondre K, Abrahamowicz M, Siemiatycki J, Rachet B. Modeling smoking history: a comparison of different approaches. Am J Epidemiol. 2002;156:813–23.

Lin Y, Zhang HH. Component selection and smoothing in multivariate nonparametric regression. Ann Stat. 2006;34:2272–97.

Lorenz E, Jenkner C, Sauerbrei W, Becher H. Modeling variables with a spike at zero. Examples and practical recommendations. Am J Epidemiol. 2017;185:1–39.

Maldonado G, Greenland S. Simulation of confounder-selection strategies. Am J Epidemiol. 1993;138:923–36.

Mallows CL. The zeroth problem. Am Stat. 1998;52:1–9.

Mantel N. Why stepdown procedures in variable selection? Technometrics. 1970;12:621–5.

Marcus R, Peritz E, Gabriel KR. On closed test procedures with special reference to ordered analysis of variance. Biometrika. 1976;63:655–60.

Marra G, Wood SN. Practical variable selection for generalized additive models. Comput Stat Data Anal. 2011;55:2372–87.

Mayr A, Binder H, Gefeller O, Schmid M. The evolution of boosting algorithms – from machine learning to statistical modelling. Methods Inf Med. 2014;53:419–27.

Meier L, van de Geer S, Bühlmann P. High-dimensional additive modeling. Ann Stat. 2009;37:3779–821.

Meinshausen N, Bühlmann P. Stability selection. J Royal Stat Soc Series B Stat Methodol. 2010;72:417–73.

Miller A. Selection of subsets of regression variables. J Royal Stat Soc Series A. 1984;147:389–425.

Miller R, Siegmund D. Maximally selected chi-square statistics. Biometrics. 1982;38:1011–6.

Moons KG, Altman DG, Reitsma JB, Ioannidis JP, Macaskill P, Steyerberg EW, Vickers AJ, Ransohoff DF, Collins GS. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med. 2015;162:W1–W73.

Morris T, White IR, Crowther MJ. Using simulation studies to evaluate statistical methods. Stat Med. 2019;38:2074–102.

Nkuipou-Kenfack E, Zürbig P, Mischak H. The long path towards implementation of clinical proteomics: exemplified based on CKD273. Proteomics Clin Appl. 2017;11:5–6.

Perperoglou A, Sauerbrei W, Abrahamowicz M, Schmid M, on behalf of TG2 of the STRATOS initiative. A review of spline function procedures in R. BMC Med Res Methodol. 2019;19:46.

Picard RP, Cook RD. Cross-validation of regression models. J Am Stat Assoc. 1984;79:575–83.

Pullenayegum EM, Platt RW, Barwick M, Feldman BM, Offringa M, Thabane L. Knowledge translation in biostatistics: a survey of current practices, preferences, and barriers to the dissemination and uptake of new statistical methods. Stat Med. 2015;35:805–18.

Raftery AE. Bayesian model selection in social research. Sociol Methodol. 1995;25:111–63.

Ramaiola I, Padró T, Peña E, Juan-Babot O, Cubedo J, Martin-Yuste V, Sabate M, Badimon L. Changes in thrombus composition and profilin-1 release in acute myocardial infarction. Eur Heart J. 2015;36:965–75.

Ramsay JO. Monotone regression splines in action. Stat Sci. 1988;3:425–41.

Ravikumar P, Liu H, Lafferty J, Wasserman L. SpAM: sparse additive models. In: Platt J, Koller D, Singer Y, Roweis S, editors. Advances in Neural Information Processing Systems. Vol. 20. Cambridge: MIT Press; 2008.

Rosenberg PS, Katki H, Swanson CA, Brown LM, Wacholder S, Hoover RN. Quantifying epidemiologic risk factors using non-parametric regression: model selection remains the greatest challenge. Stat Med. 2003;22:3369–81.

Rospleszcz S, Janitza S, Boulesteix AL. Categorical variables with many categories are preferentially selected in bootstrap-based model selection procedures for multivariable regression models. Biometrical J. 2016;58:652–73.

Royston P, Altman DG. Regression using fractional polynomials of continuous covariates: parsimonious parametric modelling. Appl Stat. 1994;43:429–67.

Royston P, Altman DG, Sauerbrei W. Dichotomizing continuous predictors in multiple regression: a bad idea. Stat Med. 2006;25:127–41.

Royston P, Sauerbrei W. Multivariable modelling with cubic regression splines: a principled approach. Stata J. 2007;7:45–70.

Royston P, Sauerbrei W. Multivariable model-building: a pragmatic approach to regression analysis based on fractional polynomials for continuous variables. Chichester: Wiley; 2008.

Sauerbrei W. The use of resampling methods to simplify regression models in medical statistics. Appl Stat. 1999;48:313–29.

Sauerbrei W, Abrahamowicz M, Altman DG, le Cessie S, Carpenter J, on behalf of the STRATOS initiative. STRengthening Analytical Thinking for Observational Studies: The STRATOS initiative. Stat Med. 2014;33:5413–32.

Sauerbrei W, Buchholz A, Boulesteix AL, Binder H. On stability issues in deriving multivariable regression models. Biometrical J. 2015;57:531–55.

Sauerbrei W, Meier-Hirmer C, Benner A, Royston P. Multivariable regression model building by using fractional polynomials: description of SAS, STATA and R programs. Comput Stat Data Anal. 2006;50:3464–85.

Sauerbrei W, Royston P. Building multivariable prognostic and diagnostic models: transformation of the predictors by using fractional polynomials. J Royal Stat Soc A. 1999;162:71–94.

Sauerbrei W, Royston P, Binder H. Selection of important variables and determination of functional form for continuous predictors in multivariable model building. Stat Med. 2007;26:5512–28.

Sauerbrei W, Schumacher M. A bootstrap resampling procedure for model building: application to the Cox regression model. Stat Med. 1992;11:2093–109.

Schmid M, Hothorn T. Boosting additive models using componentwise P-splines. Comput Stat Data Anal. 2008;53:298–311.

Shaw PA, Deffner V, Dodd KW, Freedman LS, Keogh R, Kipnis V, Küchenhoff H, Tooze JA, on behalf of Measurement Error Working group (TG4) of the STRATOS initiative. Epidemiological analyses with error prone exposures: review of current practice and recommendations. Ann Epidemiol. 2018;28:821–8.

Shmueli G. To explain or to predict? Stat Sci. 2010;25:289–310.

Smith GCS, Seaman SR, Wood AM, Royston P, White IR. Correcting for Optimistic Prediction in Small Data Sets. Am J Epidemiol. 2014;180:318–24.

Steiner M, Kim Y. The Mechanics of omitted variable bias: bias amplification and cancellation of offsetting biases. J Causal Inference. 2016;4:20160009.

Sun GW, Shook TL, Kay GL. Inappropriate use of bivariable analysis to screen risk factors for use in multivariable analysis. J Clin Epidemiol. 1996;49:907–16.

Taylor J, Tibshirani RJ. Statistical learning and selective inference. Proc Natl Acad Sci USA. 2015;112:7629–34.

Teräsvirta T, Mellin I. Model selection criteria and model selection tests in regression models. Scand J Stat. 1986;13:159–71.

Tibshirani R. Regression shrinkage and selection via the Lasso. J Royal Stat Soc Series B Methodol. 1996;58:267–88.

Tibshirani R. Regression shrinkage and selection via the lasso: a retrospective. J Royal Stat Soc Series B. 2011;73:273–82.

Tibshirani R, Taylor J, Loftus J, Reid S. Selective inference: tools for selective inference. Proc Natl Acad Sci USA. 2017;112:7629–34.

Tutz G, Binder H. Generalized additive modelling with implicit variable selection by likelihood based boosting. Biometrics. 2006;62:961–71.

van Houwelingen HC. From model building to validation and back: a plea for robustness. Stat Med. 2014;33:5223–38.

van Houwelingen HC, Sauerbrei W. Cross-validation, shrinkage and variable selection in linear regression revisited. Open J Stat. 2013;3:79–102.

van Houwelingen JC, le Cessie S. Predictive value of statistical models. Stat Med. 1990;9:1303–25.

van Walraven C, Hart RG. Leave ‘em alone - why continuous variables should be analyzed as such. Neuroepidemiology. 2008;30:138–9.

Vandenbroucke JP, von Elm E, Altman DG, Gotzsche PC, Mulrow CD, Pocock SJ, Poole C, Schlesselman JJ, Egger M. Strengthening the Reporting of Observational Studies in Epidemiology (STROBE): Explanation and Elaboration. Epidemiology. 2007;18:805–35.

Vickers AJ, Lilja H. Cutpoints in clinical chemistry: time for fundamental reassessment. Clin Chem. 2009;55:15–7.

White H. Using least squares to approximate unknown regression functions. Int Econ Rev. 1980a;21:149–70.

White H. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica. 1980b;48:817–38.

Wikimedia Foundation Inc; 2019. Statistical model. URL https://en.wikipedia.org/wiki/State_of_the_art. Accessed 1 July 2019.

Winter C, Kristiansen G, Kersting S, Roy J, Aust D, Knösel T, Rümmele P, Jahnke B, Hentrich V, Rückert F, Niedergethmann M, Weichert W, Bahra M, Schlitt HJ, Settmacher U, Friess H, Büchler M, Saeger H-D, Schroeder M, Pilarsky C, Grützmann R. Google goes cancer: improving outcome prediction for cancer patients by network-based ranking of marker genes. PLOS Comput Biol. 2012;8:e1002511.

Wood S. Thin plate regression splines. J Royal Stat Soc Series B. 2003;65:95–114.

Wood S. Generalized additive models. New York: Chapman & Hall/CRC; 2006.

Wood S. Generalized additive models: an introduction with R. 2nd ed. Boca Raton: CRC Press; 2017.

Zou H, Hastie T. Regularization and variable selection via the elastic net. J Royal Stat Soc Series B (Methodological). 2005;67:301–20.

Zou H. The adaptive LASSO and its oracle properties. J Am Stat Assoc. 2006;101:1418–29.
