Background Acute respiratory distress syndrome (ARDS) is a high-morbidity condition that accounts for 10% of ICU admissions. Clinical features usually appear within 6-72 hours of the inciting event and worsen rapidly. Mortality is also high and increases with disease severity.
Objective To establish a convenient, noninvasive early prediction model for severe ARDS.
Methods Data on three vital signs (respiratory rate, temperature, and heart rate) and the oxygenation index (PaO2/FiO2) of patients diagnosed with ARDS were retrieved from the eICU Collaborative Research Database, created by MIT and Philips. Severe ARDS was defined as PaO2/FiO2 ≤100 mmHg. With a maximum time window of 96 h, logistic regression, random forest, and LightGBM models were built on vital sign data from 6-96 h, 6-48 h, and 6-24 h before diagnosis to predict whether severe ARDS would occur. Model performance was evaluated by out-of-bag (OOB) score, cross-validation, and calibration curves, and the models were independently validated in ARDS patients from the Respiratory Intensive Care Unit of Chinese PLA General Hospital.
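The labelling rule and the three look-back windows described in the Methods can be illustrated with a minimal Python sketch. The function names, the tuple layout of the readings, and the use of a simple mean as the summary feature are assumptions made for illustration only; the abstract does not state how the vital sign series were turned into model inputs.

```python
from statistics import mean

# Severity cut-off taken from the Methods: PaO2/FiO2 <= 100 mmHg.
SEVERE_PF_THRESHOLD = 100.0


def is_severe(pao2_fio2):
    """Label one oxygenation index measurement as severe ARDS."""
    return pao2_fio2 <= SEVERE_PF_THRESHOLD


def window_features(vitals, t_ref, start_h=6, end_h=96):
    """Summarise vital signs recorded start_h to end_h hours before t_ref.

    vitals: list of (time_h, resp_rate, temperature, heart_rate) tuples
            (hypothetical layout).
    Returns the mean of each vital sign inside the window (an assumed
    aggregation), or None if the window contains no readings.
    """
    in_window = [v for v in vitals if start_h <= t_ref - v[0] <= end_h]
    if not in_window:
        return None
    return {
        "resp_rate_mean": mean(v[1] for v in in_window),
        "temperature_mean": mean(v[2] for v in in_window),
        "heart_rate_mean": mean(v[3] for v in in_window),
    }


# Hypothetical readings for one patient; the reference time is hour 100.
readings = [
    (10, 20, 37.0, 90),   # 90 h before the reference time
    (60, 24, 37.6, 100),  # 40 h before
    (85, 28, 38.2, 110),  # 15 h before
    (97, 35, 39.0, 130),  # 3 h before, excluded by the 6 h lower bound
]

features_by_window = {
    f"6-{end_h} h": window_features(readings, t_ref=100, end_h=end_h)
    for end_h in (96, 48, 24)
}
```

Each entry of `features_by_window` could then be paired with the `is_severe` label of the corresponding oxygenation index measurement to form one training row for a classifier.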
Results A total of 232 patients were retrieved from the eICU database, with 3 140 oxygenation index measurements during hospitalization, of which 1 042 had PaO2/FiO2 ≤100 mmHg. Nine prediction models were built by applying logistic regression, random forest, and LightGBM to the 6-96 h, 6-48 h, and 6-24 h vital sign data. Among the time windows, 6-96 h gave the highest prediction accuracy and AUC; among the algorithms, random forest gave the best diagnostic performance. The random forest model for the 6-96 h window had an accuracy of 0.833 and an AUC of 0.885; the AUCs for the 6-48 h and 6-24 h windows were 0.815 and 0.806, respectively. For the 6-96 h window, the AUCs of the LightGBM and logistic regression models were 0.868 and 0.634, respectively. Each model was validated in ARDS patients from Chinese PLA General Hospital, where the random forest model with the 6-96 h window again performed best, with an accuracy of 0.834 and an AUC of 0.843.
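For readers less familiar with the two metrics reported in the Results, the following minimal Python sketch shows how accuracy and AUC can be computed from predicted probabilities. It is not the authors' evaluation code. The 0.5 decision threshold and the toy labels and scores are assumptions for illustration, and the AUC is computed through its pairwise-ranking interpretation: the probability that a randomly chosen severe case scores higher than a randomly chosen non-severe case.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of cases whose thresholded prediction matches the label."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(int(p == t) for p, t in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [p for p, t in zip(y_prob, y_true) if t == 1]
    neg = [p for p, t in zip(y_prob, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = severe ARDS, 0 = not severe.
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

toy_accuracy = accuracy(y_true, y_prob)  # 4 of 6 correct at threshold 0.5
toy_auc = roc_auc(y_true, y_prob)        # 8 of 9 pairs ranked correctly
```

Because AUC depends only on how cases are ranked, it is unaffected by the choice of threshold, whereas accuracy changes with it.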
Conclusion The random forest-based early prediction model for ARDS has good predictive ability. Using three noninvasive, easy-to-obtain physical indicators (heart rate, body temperature, and respiratory rate), it can warn of the occurrence of severe ARDS, helping medical staff to intervene and treat earlier, relieve the pressure of inadequate medical resources, and improve the success rate of treatment.