Diagnostic value of machine learning models for depth of invasion in esophageal cancer: A comparison based on radiomic features and deep learning features
Abstract
Background: Esophageal cancer is one of the leading causes of cancer-related death worldwide, and the choice of treatment strategy depends heavily on accurate preoperative staging of the depth of tumor invasion. Distinguishing tumors confined to the mucosa (stages Tis-T1a) from those that have invaded the submucosa or deeper layers (stages T1b-T3) is crucial for deciding between endoscopic minimally invasive treatment and surgical resection.

Objective: To develop and validate an integrated model based on CT radiomics and machine learning for the preoperative, non-invasive discrimination between stage Tis-T1a and stage T1b-T3 lesions in esophageal cancer.

Methods: Clinical and imaging data were collected from patients with pathologically confirmed esophageal cancer treated at the First Medical Center of PLA General Hospital from January 2018 to June 2025. Radiomic features were extracted from each patient's preoperative CT images using Python 3.9 to form a radiomic feature set. In parallel, deep learning features were extracted from the same CT images using a 3D-ResNet architecture to form a deep learning feature set. Least Absolute Shrinkage and Selection Operator (LASSO) regression was applied to select features from each set. Using the selected features as input variables and the entire cohort as the training set, six machine learning algorithms, namely Support Vector Machine (SVM), Random Forest (RF), Stochastic Gradient Descent (SGD), k-Nearest Neighbors (KNN), eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM), were used to develop three types of models for diagnosing Tis-T1a versus T1b-T3 lesions: a radiomics model (Rad), a deep learning-based model (DL), and a combined model (Rad+DL). Internal validation was performed using the bootstrap method with 1,000 resampling iterations. Diagnostic performance was evaluated using receiver operating characteristic (ROC) curve analysis. Calibration was assessed with calibration curves, and clinical utility was quantified as net benefit across threshold probabilities using decision curve analysis (DCA).

Results: A total of 340 patients with esophageal cancer were included, comprising 292 men and 48 women, with a mean age of 61.97 ± 7.58 years. Of these, 68 patients were staged as Tis-T1a and 272 as T1b-T3. From an initial pool of 1,130 radiomic features and 512 deep learning features, LASSO regression selected 9 radiomic features and 11 deep learning features, respectively, for model construction. Among the six machine learning algorithms, RF showed the best overall performance. Among the three RF-based models, the Rad+DL model showed better diagnostic performance than the Rad or DL model alone. The areas under the ROC curve (AUCs) of the Rad+DL, Rad, and DL models were 0.877, 0.816, and 0.799, respectively, in the training set, and 0.794, 0.785, and 0.678, respectively, in internal validation. Decision curve analysis further indicated that the Rad+DL model provided clinical net benefit across a wide range of threshold probabilities and substantially outperformed both the Rad and DL models. Calibration curve analysis showed good agreement between the probabilities predicted by the Rad+DL model and the observed outcomes.

Conclusion: The combined model developed in this study can use CT imaging data for the preoperative, non-invasive discrimination of tumor invasion depth in esophageal cancer (Tis-T1a vs. T1b-T3) with high accuracy. It offers clinicians a valuable imaging-based reference for formulating individualized treatment plans, particularly when choosing between endoscopic and surgical approaches.
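The modelling steps named in the Methods (LASSO feature selection, a Random Forest classifier, bootstrap internal validation, and net benefit for decision curve analysis) can be sketched roughly as follows. This is an illustrative sketch on synthetic data, not the study's code: the use of scikit-learn, the helper names `lasso_select`, `bootstrap_auc`, and `net_benefit`, the hyperparameters, and the out-of-bag variant of the bootstrap are all assumptions, since the abstract does not specify them.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LassoCV
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import StandardScaler


def lasso_select(X, y, seed=0):
    """Return indices of features with non-zero LASSO coefficients."""
    X_std = StandardScaler().fit_transform(X)  # LASSO is scale-sensitive
    lasso = LassoCV(cv=5, random_state=seed).fit(X_std, y)
    return np.flatnonzero(lasso.coef_)


def bootstrap_auc(X, y, n_boot=1000, seed=0):
    """Mean AUC over bootstrap resamples, scored on out-of-bag samples.

    Assumption: the out-of-bag variant; the abstract only states that
    1,000 bootstrap resampling iterations were used.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)        # sample with replacement
        oob = np.setdiff1d(np.arange(n), idx)   # samples left out of this draw
        if len(np.unique(y[idx])) < 2 or len(np.unique(y[oob])) < 2:
            continue  # AUC is undefined without both classes
        rf = RandomForestClassifier(n_estimators=50, random_state=seed)
        rf.fit(X[idx], y[idx])
        aucs.append(roc_auc_score(y[oob], rf.predict_proba(X[oob])[:, 1]))
    return float(np.mean(aucs))


def net_benefit(y_true, y_prob, threshold):
    """Net benefit at threshold pt: TP/n - (FP/n) * pt / (1 - pt)."""
    pred = y_prob >= threshold
    n = len(y_true)
    tp = np.sum(pred & (y_true == 1))
    fp = np.sum(pred & (y_true == 0))
    return tp / n - fp / n * threshold / (1 - threshold)


# Synthetic stand-in for the feature matrix: 340 samples, roughly 20% class 0
# (Tis-T1a) and 80% class 1 (T1b-T3). Real inputs would be the radiomic and
# 3D-ResNet features.
X, y = make_classification(n_samples=340, n_features=60, n_informative=8,
                           weights=[0.2, 0.8], random_state=0)

selected = lasso_select(X, y)
if selected.size == 0:            # fall back to all features if LASSO drops everything
    selected = np.arange(X.shape[1])

# 20 resamples keep the demo fast; the study reports 1,000.
auc = bootstrap_auc(X[:, selected], y, n_boot=20)
print(f"{selected.size} features selected, bootstrap AUC = {auc:.3f}")
```

In this sketch the features are selected once on the whole cohort before resampling, which can make the bootstrap estimate optimistic; a stricter variant would repeat the selection inside each resample. Plotting `net_benefit` over a grid of thresholds for each model gives a decision curve.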