Abstract
Background Acute nonvariceal upper gastrointestinal bleeding (ANVUGIB) is a common emergency characterized by rapid onset and progression. Early identification of patients at risk of a poor prognosis, followed by targeted treatment, is essential for improving clinical outcomes. However, current clinical risk scoring systems often fail to identify high-risk patients accurately.
Objective To construct a prediction system for severe adverse outcomes in emergency patients with ANVUGIB, covering early rebleeding, mortality, surgical or interventional treatment, and other outcomes, so that high-risk patients can be identified conveniently and accurately.
Methods A total of 471 patients with ANVUGIB admitted to the Emergency Department of the First Medical Center of PLA General Hospital from December 2023 to May 2025 were enrolled. Clinical data, including vital signs, laboratory results, comorbidities, and underlying diseases, were used as candidate predictors. Based on outcomes within 7 days of admission (death, rebleeding, emergency surgery or intervention, resuscitation, or transfer to the ICU), patients were divided into a good prognosis group (n=337) and a poor prognosis group (n=134). Key variables were screened using univariate and multivariate logistic regression (forward: conditional), combined with clinical expert judgment and collinearity analysis. Five machine learning models were constructed: extreme gradient boosting (XGBoost), random forest (RF), gradient boosting machine (GBM), logistic regression (LR), and support vector machine (SVM). The SHapley Additive exPlanations (SHAP) algorithm was used to interpret the best-performing model. The models were also compared with three commonly used clinical risk scoring systems, the pre-endoscopic Rockall score (PRS), the Glasgow-Blatchford score (GBS), and AIMS65, using the DeLong test.
Results Of the 471 patients, 325 (69.1%) were male and 146 (30.9%) were female. The median age was 66 (57, 74) years, ranging from 18 to 95 years. Ten key variables were selected from 49 candidates to construct the five machine learning models, of which XGBoost performed best, with an area under the receiver operating characteristic curve (AUC) of 0.904, accuracy of 0.842, precision of 0.804, recall of 0.834, F1 score of 0.818, and Brier score of 0.122. SHAP analysis indicated that the three variables contributing most to the XGBoost model were the Glasgow Coma Scale score, coexisting upper gastrointestinal tumors, and urea. The AUCs of PRS, GBS, and AIMS65 were 0.823, 0.791, and 0.631, respectively. The XGBoost model showed significantly better discrimination than PRS (P<0.001); its advantage over GBS was nominally significant but did not remain so after Bonferroni correction (unadjusted P=0.028, adjusted P=0.165); and no significant difference was observed compared with AIMS65 (P=0.322). XGBoost also outperformed all three scoring systems in accuracy, precision, recall, and F1 score.
Conclusion This study indicates that the XGBoost machine learning model, built on common clinical indicators such as vital signs, laboratory results, comorbidities, and underlying diseases, is an effective tool for predicting multiple early adverse outcomes in emergency ANVUGIB patients at our center. It is expected to support clinical decision-making during the critical treatment window for these patients, which may improve emergency care and reduce medical resource consumption.
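For readers unfamiliar with the performance metrics reported in the Results (AUC, accuracy, precision, recall, F1 score, and Brier score), the following minimal Python sketch shows how each can be computed from observed outcomes and predicted probabilities. It is not the study's code: the function names, the 0.5 decision threshold, and the toy data are assumptions made purely for illustration, and the numbers it prints have no relation to the study's results.

```python
from typing import Dict, Sequence


def auc_rank(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher predicted probability (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classification_metrics(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Compute threshold-based metrics plus AUC and Brier score.
    The 0.5 threshold is an illustrative assumption, not the study's cutoff."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 1)
    fp = sum(1 for y, yh in zip(y_true, y_pred) if y == 0 and yh == 1)
    fn = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 0)
    tn = sum(1 for y, yh in zip(y_true, y_pred) if y == 0 and yh == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    # Brier score: mean squared difference between predicted probability and outcome.
    brier = sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

    return {
        "auc": auc_rank(y_true, y_prob),
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "brier": brier,
    }


# Synthetic toy data (1 = poor prognosis, 0 = good prognosis); not patient data.
toy_outcomes = [1, 1, 1, 0, 0, 0, 0, 1]
toy_probabilities = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

for name, value in classification_metrics(toy_outcomes, toy_probabilities).items():
    print(f"{name}: {value:.3f}")
```

AUC and the Brier score are computed directly from the predicted probabilities, whereas accuracy, precision, recall, and F1 score depend on the chosen decision threshold. This is one reason a probabilistic model and an integer-valued clinical score may rank differently across these metrics.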