Value of machine learning in predicting early adverse outcomes in emergency patients with acute nonvariceal upper gastrointestinal bleeding

Abstract: Background Acute nonvariceal upper gastrointestinal bleeding (ANVUGIB) is a common critical condition in emergency medicine, with rapid onset and progression. Identifying patients at risk of early adverse outcomes and treating them accordingly is essential for improving clinical outcomes. However, the risk scoring systems in common clinical use do not accurately identify high-risk patients. Objective To construct a comprehensive prediction system for severe adverse outcomes in emergency patients with ANVUGIB, covering early rebleeding, death, and surgical or interventional treatment among other outcomes, so that high-risk patients can be identified conveniently and accurately. Methods A total of 471 patients with ANVUGIB admitted to the Emergency Department of the First Medical Center of PLA General Hospital from December 2023 to May 2025 were enrolled. Clinical data, including vital signs, laboratory results, comorbidities, and underlying diseases, were used as predictive variables. Patients were divided into a good prognosis group (n=337) and a poor prognosis group (n=134) according to whether any of the following occurred within 7 days of admission: death, rebleeding, emergency surgery and/or intervention, resuscitation, or transfer to the ICU. Key variables were screened by univariate and multivariate logistic regression (stepwise, "forward: conditional" method), combined with clinical expert experience and collinearity analysis. Five machine learning models were constructed: extreme gradient boosting (XGBoost), random forest (RF), gradient boosting machine (GBM), logistic regression (LR), and support vector machine (SVM). The SHAP algorithm was used to interpret the best-performing model, which was also compared, using the DeLong test, with three commonly used clinical risk scoring systems: the pre-endoscopic Rockall score (PRS), the Glasgow-Blatchford score (GBS), and AIMS65. Results Among the 471 patients, 325 (69.1%) were male and 146 (30.9%) were female. The median age was 66 years (interquartile range 57 to 74), with a range of 18 to 95 years. Ten key variables were selected from 49 candidates to construct the five machine learning models. Among these, XGBoost performed best: its area under the receiver operating characteristic (ROC) curve (AUC), accuracy, precision, recall, F1 score, and Brier score were 0.904, 0.842, 0.804, 0.834, 0.818, and 0.122, respectively. SHAP analysis indicated that the three most influential variables in the XGBoost model were the Glasgow Coma Scale score, coexisting upper gastrointestinal tumor, and urea. Compared with PRS, GBS, and AIMS65 (AUCs of 0.823, 0.791, and 0.631, respectively), the XGBoost model had significantly better predictive ability than PRS (P<0.001) and showed a trend toward outperforming GBS (unadjusted P=0.028, Bonferroni-adjusted P=0.165), while its difference from AIMS65 was not statistically significant (P=0.322). XGBoost also outperformed AIMS65, GBS, and PRS on accuracy, precision, recall, and F1 score. Conclusion This study shows that an XGBoost machine learning model based on common clinical indicators, such as vital signs, laboratory results, comorbidities, and underlying diseases, is an effective tool for predicting multiple early adverse outcomes in emergency ANVUGIB patients at our center. It is expected to support clinical decision-making during the critical treatment window for ANVUGIB patients, thereby improving the overall level of emergency care and reducing the consumption of medical resources.
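The performance measures named in the Results (AUC, accuracy, precision, recall, F1 score, and Brier score) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `roc_auc` and `classification_metrics` are hypothetical, the 0.5 decision threshold is an assumption (the abstract does not state one), and the labels and predicted probabilities in the demo are invented toy values rather than study data.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                wins += 1.0
            elif pp == pn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def classification_metrics(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off; not given in the abstract
) -> Dict[str, float]:
    """Compute the six measures from labels (1 = poor prognosis) and
    predicted probabilities of a poor prognosis."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    n = len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # Brier score: mean squared difference between probability and outcome
    # (lower is better; 0 is a perfect probabilistic forecast).
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / n

    return {
        "auc": roc_auc(y_true, y_prob),
        "accuracy": (tp + tn) / n,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "brier": brier,
    }


if __name__ == "__main__":
    # Invented toy data for illustration only.
    labels = [1, 1, 1, 0, 0, 0, 0, 0]
    probs = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
    for name, value in classification_metrics(labels, probs).items():
        print(f"{name}: {value:.3f}")
```

AUC and the Brier score are computed from the predicted probabilities themselves, whereas accuracy, precision, recall, and F1 score depend on the chosen threshold, so comparisons between a model and a point-based score on the latter four measures are sensitive to how each cut-off is set.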

     
