Development and validation of a short-term mortality risk prediction model for severe pneumonia

胡海明; 赵立娟; 李米锶; 周鸽儿; 乔一珊; 常德

doi:10.12435/j.issn.2095-5227.26010704

Abstract: Background　Patients with severe pneumonia are critically ill and have a high mortality rate; early intervention in this population can significantly improve prognosis. Objective　To identify risk factors for death among patients with pneumonia in the intensive care unit (ICU), and to develop a prediction model for short-term mortality (death within 7 days) and apply it in clinical practice. Methods　Clinical data on 4,247 patients with severe pneumonia (2008-2022) were collected from the US MIMIC-Ⅳ database and randomly divided into a training set (n=3,185) and a validation set (n=1,062) at a ratio of 7.5:2.5. Using features selected by LASSO regression, five machine learning models were built in the training and validation sets: logistic regression, elastic net (Enet), random forest (RF), extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM). The best model was selected by jointly assessing receiver operating characteristic (ROC) curves, calibration curves, decision curve analysis (DCA), sensitivity, specificity, and negative predictive value (NPV), and was compared with the Sequential Organ Failure Assessment (SOFA) score. A nomogram was built from the final model to improve interpretability, and an online web calculator was developed to improve clinical usability. Results　The final model included nine key features: pH, age, blood urea nitrogen, oxygen saturation, lactate, partial thromboplastin time (PTT), anticoagulant therapy, type of infection, and anti-pathogen treatment. Of the five models, logistic regression showed the best overall performance in the validation set (area under the ROC curve [AUC]=0.82, 95% CI: 0.77-0.89; NPV=0.99, 95% CI: 0.97-0.99), and the calibration and DCA curves indicated good stability and clinical applicability. The logistic model had a Brier score of 0.042, a calibration slope of 0.944, and a calibration intercept of 0.027; the Hosmer-Lemeshow goodness-of-fit test indicated good fit (P=0.768), and the model predicted significantly better than the SOFA score (DeLong test, P<0.05). The final model was also deployed as an online web calculator for rapid assessment of short-term mortality risk in patients with severe pneumonia. Conclusion　This study developed and validated a machine learning-based clinical prediction model for short-term mortality risk in severe pneumonia. The model shows high predictive performance and clinical utility and can support early clinical decision-making and risk assessment.
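
For readers who want to see how the reported validation metrics are defined, the following is a minimal sketch in plain Python of the AUC, the Brier score, and the threshold-based measures (sensitivity, specificity, NPV). It is not the authors' code. The function names, the toy data, and the 0.5 risk cut-off are assumptions made for illustration only; the abstract does not state which cut-off was used, and the toy numbers are unrelated to the MIMIC-Ⅳ results.

```python
from typing import Dict, Sequence


def auc(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one death and one survivor")
    # Fraction of (death, survivor) pairs where the death has the higher
    # predicted risk; tied predictions count as half a correct ranking.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def _ratio(numerator: int, denominator: int) -> float:
    """Safe division that returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def threshold_metrics(
    labels: Sequence[int], probs: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Sensitivity, specificity, and NPV at a given risk cut-off.

    The default threshold of 0.5 is an illustrative assumption.
    """
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probs):
        predicted_death = p >= threshold
        if predicted_death and y == 1:
            tp += 1
        elif predicted_death:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "npv": _ratio(tn, tn + fn),
    }


if __name__ == "__main__":
    # Hypothetical toy data: 1 = died within 7 days, 0 = survived.
    toy_labels = [0, 0, 0, 0, 0, 0, 1, 1]
    toy_probs = [0.05, 0.10, 0.20, 0.30, 0.15, 0.60, 0.70, 0.40]
    print("AUC:", round(auc(toy_labels, toy_probs), 3))
    print("Brier:", round(brier_score(toy_labels, toy_probs), 3))
    print(threshold_metrics(toy_labels, toy_probs))
```

With a rare outcome such as 7-day death, NPV tends to be high at most cut-offs, so it is best read alongside sensitivity and the calibration measures rather than on its own.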

Development and validation of a short-term mortality risk prediction model for severe pneumonia