Risk assessment of coronary heart disease based on big data modeling

王逸飞; 康季槐; 应俊; 杨俊杰; 陈康

doi:10.3969/j.issn.2095-5227.2019.08.005

Abstract:
Objective To quantify the risk of coronary heart disease and screen for risk indicators using large-sample epidemiological survey data.
Methods Data on 19 021 cases were collected from a community epidemiological survey of chronic diseases conducted by the Chinese PLA General Hospital in 2015, covering personal information and lifestyle habits, medical and family history, laboratory test indicators, and electrocardiogram indicators. Samples with less than 70% data completeness were excluded. Missing values were imputed with a stepwise K-nearest neighbor method, the AdaBoost algorithm was used for risk assessment, and the model was validated by 10-fold cross-validation.
Results Age, duration of hypertension, dyslipidemia, other comorbidities, duration of diabetes, and low-density lipoprotein cholesterol were important indicators for assessing the risk of coronary heart disease. The recall, accuracy, AUC, and F1 score of the model were 0.727, 0.741, 0.796, and 0.796, respectively.
Conclusion The model established in this study can serve as a reference for predicting an individual's risk of coronary heart disease.
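
The preprocessing and evaluation steps named in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe the stepwise variant of K-nearest neighbor imputation, so plain KNN mean imputation is substituted here, and the AdaBoost classifier and 10-fold cross-validation are omitted (in practice a library such as scikit-learn, with `KNNImputer` and `AdaBoostClassifier`, would likely be used). The field layout, the toy records, and all function names are hypothetical.

```python
import math

def completeness(row):
    """Fraction of fields in a record that are not missing (None)."""
    return sum(v is not None for v in row) / len(row)

def drop_incomplete(rows, threshold=0.70):
    """Keep records whose completeness is at least the threshold."""
    return [r for r in rows if completeness(r) >= threshold]

def knn_impute(rows, k=2):
    """Plain KNN mean imputation (an assumption, not the paper's stepwise
    method): fill a missing field with the mean of that field over the k
    nearest records, with distance measured on the fields both records share."""
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = []
    for i, row in enumerate(rows):
        new_row = list(row)
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors: other records in which field j was observed.
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )[:k]
            if donors:
                new_row[j] = sum(v for _, v in donors) / len(donors)
        filled.append(new_row)
    return filled

def classification_metrics(y_true, y_pred):
    """Recall, accuracy, precision and F1 for binary labels (1 = disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"recall": recall, "accuracy": (tp + tn) / len(pairs),
            "precision": precision, "f1": f1}

# Hypothetical toy records: [age, LDL-C, hypertension flag, systolic pressure].
records = [
    [60, 3.0, 1.0, 140],
    [62, None, 1.0, 150],
    [40, 2.0, 0.0, 120],
    [None, None, None, 130],   # only 25% complete, so it is dropped
    [58, 3.4, 1.0, None],
]
kept = drop_incomplete(records, threshold=0.70)
filled = knn_impute(kept, k=2)
print(filled)
print(classification_metrics([1, 1, 1, 0, 0, 0, 1, 0],
                             [1, 1, 0, 0, 0, 1, 1, 0]))
```

The sketch leaves features unscaled for brevity; with real data, fields measured on different scales would normally be standardized before computing distances, so that no single field dominates the choice of neighbors.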
