Alternative expectation approaches for expectation-maximization missing data imputations in Cox regression


Sağlam F., Sanli T., Cengiz M. A., Terzi Y.

COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2021 (SCI-Expanded)

  • Publication Type: Article
  • Publication Date: 2021
  • Doi Number: 10.1080/03610918.2021.2024851
  • Journal Name: COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Applied Science & Technology Source, Business Source Elite, Business Source Premier, CAB Abstracts, Compendex, Computer & Applied Sciences, Veterinary Science Database, zbMATH, Civil Engineering Abstracts
  • Keywords: Cox regression, Expectation-maximization, Machine learning, Missing data, PROPORTIONAL HAZARDS MODEL, INCOMPLETE DATA, LIKELIHOOD, EQUATIONS
  • Ondokuz Mayıs University Affiliated: Yes

Abstract

Missing data are common in survival analysis and are either removed or imputed using various methods. Expectation-maximization (EM) imputation is a popular method in Cox regression studies. This paper investigated the effect of different regression methods on Cox regression modeling within the EM framework. A stratified Cox regression model was derived from a dataset of categorical and numerical variables. Missing data were imputed using the EM framework with five machine learning algorithms, and the results were then compared with the full model. The results show that the recursive partitioning and regression trees (RPART) method performed better than the others. However, all regression methods performed poorly in categorical covariate imputation. R code is available online.
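
The idea of using a regression method as the expectation step inside an EM-style imputation loop can be illustrated with a minimal Python sketch. This is not the authors' R code: the function names `fit_line` and `em_style_impute`, the toy data, and the use of simple least squares as the regressor are all assumptions made for illustration. The learners named in the abstract, such as RPART, would take the place of `fit_line`, and the survival outcome and the Cox model itself are left out entirely.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (hypothetical stand-in for any regressor)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b


def em_style_impute(x, y, max_iter=50, tol=1e-8):
    """Fill None entries of y by alternating refitting and prediction."""
    missing = [i for i, v in enumerate(y) if v is None]
    observed = [v for v in y if v is not None]
    # Initial guess: replace each missing value with the observed mean.
    filled = [mean(observed) if v is None else v for v in y]
    for _ in range(max_iter):
        # Refit the regressor on the currently completed data.
        a, b = fit_line(x, filled)
        change = 0.0
        # Replace each missing value with its current prediction.
        for i in missing:
            new_value = a + b * x[i]
            change = max(change, abs(new_value - filled[i]))
            filled[i] = new_value
        # Stop once the imputed values no longer move.
        if change < tol:
            break
    return filled


# Toy data (assumed): the observed points lie exactly on y = 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.0, None, 8.0, None, 12.0]
completed = em_style_impute(x, y)
```

In this toy case the imputed values settle near the line through the observed points, and the observed entries are never modified. A categorical covariate would need a classifier in place of the regressor.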