Machine learning models for predicting frailty in the elderly based on body composition data

  • Abstract:
      Background  Many frailty assessment scales are available, but most of their indicators are difficult to quantify objectively. A body composition analyzer can quickly provide quantitative data relevant to frailty assessment, and machine learning has certain advantages in mining and analyzing large datasets.
      Objective  To establish machine learning models based on body composition data and to evaluate their value in diagnosing and predicting frailty.
      Methods  Physical examination data from adults aged over 65 years in 10 Beijing communities were collected from April to June 2021, with the Fried frailty phenotype scale serving as the gold standard for frailty diagnosis. Relevant indicators were screened, and random forest, support vector machine, logistic regression, and XGBoost models were then established. Predictive performance was evaluated using ROC curves, sensitivity, and specificity.
      Results  A total of 558 cases were included in the modeling analysis: 122 pre-frailty cases and 436 non-frailty cases. The random forest algorithm selected the 10 most important features, including age, 50 kHz whole-body phase angle, skeletal muscle mass, and percent body fat, and the four prediction models were built on these features. The logistic regression model had the highest overall predictive performance, with an area under the ROC curve of 0.872, sensitivity of 78.38%, specificity of 80.15%, and prediction accuracy of 79.76%. The other three models differed little in overall performance, and all had prediction accuracy above 75%.
      Conclusion  The logistic regression model based on body composition data predicted frailty in the elderly better than the other machine learning models and achieved relatively high prediction accuracy; it can be used for the early clinical diagnosis of frailty.
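The evaluation measures named in the abstract (sensitivity, specificity, accuracy, and area under the ROC curve) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names are hypothetical, the confusion-matrix counts are an assumption chosen only because they reproduce the reported percentages (the abstract does not give the actual counts or the test-set size), and the toy scores are made up purely to show the AUC calculation.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fn, tn, fp) for binary labels: 1 = pre-frail, 0 = non-frail."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp, fn, tn, fp


def summarize(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy as fractions in [0, 1]."""
    return {
        "sensitivity": tp / (tp + fn),                 # true positive rate
        "specificity": tn / (tn + fp),                 # true negative rate
        "accuracy": (tp + tn) / (tp + fn + tn + fp),   # overall agreement
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# ASSUMPTION: hypothetical counts, not reported in the abstract. They are
# used here only because they yield the percentages quoted in the Results.
metrics = summarize(tp=29, fn=8, tn=105, fp=26)
print({name: round(100 * value, 2) for name, value in metrics.items()})

# Made-up toy data to demonstrate the AUC calculation (not study data).
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1]
print(roc_auc(toy_labels, toy_scores))
```

The pairwise AUC above is quadratic in the number of cases, which is fine for a few hundred records; a rank-based formula or a library routine would be the usual choice for larger datasets.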

     
