BD - Earth day 2024

Identifying Prognostic Factors for Survival in Intensive Care Unit Patients With Sirs or Sepsis by Machine Learning Analysis on Electronic Health Records

Maximiliano Mollura, Davide Chicco, Alessia Paglialonga, Riccardo Barbieri


Systemic inflammatory response syndrome (SIRS) and sepsis are the most common causes of in-hospital death. However, the characteristics associated with improvement in patient condition during the ICU stay have not been fully elucidated for each population, nor have the possible differences between the two.


Patient outcome has long been used as the primary endpoint for trials in critical care, as well as for determining a patient’s prognosis after treatment. Patient mortality and survival are indeed the major clinical outcomes, and they are the main targets for assessing the prognostic factors driving patient condition and the effectiveness of clinical interventions [1, 2], especially in the intensive care unit (ICU), where admitted patients are usually in very critical condition and require constant monitoring and treatment.

Materials and methods

In this retrospective study, we developed predictive models of patient survival using machine learning algorithms, and we evaluated the importance of features associated with patient survival using machine learning and biostatistics approaches for the SIRS and SEPSIS populations separately (Fig 1). All the analyses were performed with the Python 3.8.3 programming language and the scikit-learn 1.0 and SciPy 1.7.1 software packages. Observations with missing information (three patients) were removed.
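The workflow described above (drop incomplete records, then fit a separate model for each cohort and rank the features) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the random forest classifier, the feature names, and the helpers `make_synthetic_cohort` and `fit_cohort_model` are assumptions introduced here, since this excerpt does not name the algorithms used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical feature names; the real dataset's columns may differ.
FEATURES = ["apache_ii", "sofa", "crp", "wbc"]


def make_synthetic_cohort(n):
    """Create synthetic data (NOT real patient data) with a few missing values."""
    X = rng.normal(size=(n, len(FEATURES)))
    # Survival loosely depends on the first two columns, for illustration only.
    y = (X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=n) < 0).astype(int)
    X[rng.choice(n, size=3, replace=False), 2] = np.nan  # three incomplete rows
    return X, y


def fit_cohort_model(X, y):
    """Drop rows with missing values, then fit a classifier on the rest."""
    complete = ~np.isnan(X).any(axis=1)
    X, y = X[complete], y[complete]
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X, y)
    return model, X.shape[0]


# Fit one model per cohort, mirroring the separate SIRS and SEPSIS analyses.
models = {}
for cohort in ("SIRS", "SEPSIS"):
    X, y = make_synthetic_cohort(200)
    models[cohort], n_used = fit_cohort_model(X, y)
    ranking = sorted(zip(FEATURES, models[cohort].feature_importances_),
                     key=lambda pair: pair[1], reverse=True)
    print(cohort, n_used, [name for name, _ in ranking])
```

Training the two cohorts independently keeps their feature rankings comparable side by side, which is the kind of between-group comparison the study is after.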


Results of the statistical analysis are reported in Table 4. APACHE II and SOFA scores showed a significant (p<0.0001) association with survival in both the SIRS and SEPSIS cohorts. EoC was significantly associated (p<0.0001) with survival in the SIRS cohort only.
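As an illustration of how such an association between a severity score and survival might be tested with SciPy, the sketch below compares synthetic scores for survivors and non-survivors. The choice of the Mann-Whitney U test is an assumption, since this excerpt does not name the test used, and all numbers are generated for the example rather than taken from Table 4.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(42)

# Synthetic severity scores (NOT study data): non-survivors score higher on average.
scores_survivors = rng.normal(loc=15.0, scale=5.0, size=120)
scores_non_survivors = rng.normal(loc=25.0, scale=5.0, size=80)

# Two-sided test of whether the two score distributions differ.
statistic, p_value = mannwhitneyu(scores_survivors, scores_non_survivors,
                                  alternative="two-sided")

# Same significance threshold as the one quoted in the text.
significant = p_value < 0.0001
print(f"U = {statistic:.1f}, p = {p_value:.3g}, significant: {significant}")
```

A rank-based test is a reasonable default for clinical scores, which are often skewed, but a different test may be appropriate depending on the variable type.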


Gucyetmez et al. [10] collected the data used in this study to explore the ability of hemogram and CRP to discriminate between SIRS and SEPSIS cohorts. However, the authors did not investigate the prognostic role of the selected features within each cohort. Our study therefore aimed to investigate in more detail the importance of these features and the possible differences between the considered cohorts. Specifically, we evaluated the ability of hemogram parameters to predict the survival of ICU patients diagnosed with SIRS or SEPSIS, using a set of parameters usually available in patient clinical records. We used widely available features such as patient sex, illness severity scores commonly measured and recorded at ICU admission, C-reactive protein, and blood cell count measurements. Patients’ comorbidities were not available in the patient records shared by Gucyetmez et al.


The proposed study applies an original machine learning paradigm for processing clinical information at ICU admission to predict patient survival. The approach relies on multi-variable predictive modeling based on information gathered at ICU admission, and is aimed at predicting the likelihood of survival for patients with SIRS and with SEPSIS. Results provide insights into the differences in the most relevant variables between the two groups. A Monte Carlo Cross-Validation procedure was further applied to obtain robust estimates of the scores. The sensitivity analysis showed that results did not notably vary with hyperparameter tuning, thus confirming the need for a larger cohort to advance to a fully calibrated, deployable model.
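Monte Carlo cross-validation repeats a random train/test split many times and summarizes the resulting scores. The sketch below shows one way to do this with scikit-learn's `StratifiedShuffleSplit`. The logistic regression model, the Matthews correlation coefficient as the score, the 100 repetitions, and the 80/20 split are assumptions chosen for illustration, and the data are synthetic.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import matthews_corrcoef
from sklearn.model_selection import StratifiedShuffleSplit

rng = np.random.default_rng(1)

# Synthetic admission features and survival labels (NOT study data).
X = rng.normal(size=(300, 5))
y = (X[:, 0] - X[:, 1] + rng.normal(scale=0.8, size=300) > 0).astype(int)

# Monte Carlo cross-validation: many independent random train/test splits,
# each preserving the class proportions of the full dataset.
splitter = StratifiedShuffleSplit(n_splits=100, test_size=0.2, random_state=1)

scores = []
for train_idx, test_idx in splitter.split(X, y):
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])
    scores.append(matthews_corrcoef(y[test_idx], model.predict(X[test_idx])))

scores = np.array(scores)
print(f"MCC over {len(scores)} splits: "
      f"mean = {scores.mean():.3f}, std = {scores.std():.3f}")
```

Reporting the mean together with the spread across splits is what makes the estimate robust: with a small cohort, a single split can be unusually favorable or unfavorable.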

Citation: Mollura M, Chicco D, Paglialonga A, Barbieri R (2024) Identifying prognostic factors for survival in intensive care unit patients with SIRS or sepsis by machine learning analysis on electronic health records. PLOS Digit Health 3(3): e0000459.

Editor: Nan Liu, Duke-NUS Medical School, SINGAPORE

Received: February 9, 2023; Accepted: February 5, 2024; Published: March 15, 2024

Copyright: © 2024 Mollura et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The dataset is publicly available at the following URL:

Funding: The work of DC was funded by the European Union – Next Generation EU program, in the context of The National Recovery and Resilience Plan, Investment Partenariato Esteso PE8 “Conseguenze e sfide dell’invecchiamento”, Project Age-It (Ageing Well in an Ageing Society) and partially supported by Ministero dell’Università e della Ricerca of Italy under the “Dipartimenti di Eccellenza 2023-2027” ReGAInS grant assigned to Dipartimento di Informatica Sistemistica e Comunicazione at Università di Milano-Bicocca. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

