Detailed Abstract
[E-poster]
[EP103] Comparison of performance between machine learning technique and conventional logistic regression analysis in terms of risk prediction for malignancy of the intraductal papillary mucinous neoplasm of pancreas – A multi-national multi-institutional retrospective cohort study
Jae Seung KANG1, Chanhee LEE2, Youngmin HAN1, Yoo Jin CHOI1, Yoonhyeong BYUN1, Hongbeom KIM1, Wooil KWON1, Taesung PARK3, Jin-Young JANG*1
1Surgery and Cancer Research Institute, Seoul National University College of Medicine, Korea
2Interdisciplinary Program in Bioinformatics, Seoul National University, Korea
3Statistics, Seoul National University, Korea
Introduction : Most nomograms for predicting malignant intraductal papillary mucinous neoplasm (IPMN) of the pancreas have been developed using logistic regression (LR) analysis. This study aimed to develop a prediction model using machine learning (ML) and to compare the performance of the ML and LR models.
Methods : This was a multi-national, multi-institutional, retrospective study. Malignant IPMNs were defined as those with high-grade dysplasia or an associated invasive carcinoma. An automated ML (AutoML) technique was used in R. Six ML algorithms (XGBoost, deep learning, distributed random forest, generalized linear model, gradient boosting machine, and stacked ensemble [SE]) were applied and compared. The best-performing algorithm was selected, and its performance was compared with that of the LR model.
Results : A total of 3,096 patients were enrolled. The patients were divided into a model development set and an external validation set at a ratio of 2:1. In the multivariate LR analysis, age, sex, main duct diameter, cyst size, mural nodule, and tumor location were independent risk factors for malignant IPMN, and the LR model was built from these factors. Of the six algorithms, SE had the highest area under the receiver operating characteristic curve (AUC) in the internal validation (AUC, 0.742). In the external validation, the performance of the ML and LR models was comparable (AUC, 0.725 vs. 0.721).
Conclusions : The performance of the LR model was comparable to that of the ML model. The LR model would be more practical because of its convenience in clinical use.
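Illustrative note : The AUC used above to compare the two models can be read as the probability that a randomly chosen malignant case receives a higher predicted risk than a randomly chosen non-malignant case. The following is a minimal Python sketch of that calculation, not the authors' code (the study used R, and the abstract does not state how AUC was computed). The function name `auc`, the variable names, and all values are hypothetical toy data, not study data.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 = malignant, 0 = non-malignant (hypothetical coding).
    scores: predicted risk from a model; higher means more suspicious.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # malignant case ranked above non-malignant case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy, made-up predictions from two hypothetical models on the same 8 cases.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
model_a = [0.9, 0.8, 0.4, 0.3, 0.5, 0.2, 0.1, 0.7]
model_b = [0.8, 0.9, 0.3, 0.4, 0.5, 0.2, 0.1, 0.6]

print(f"model A AUC = {auc(labels, model_a):.4f}")
print(f"model B AUC = {auc(labels, model_b):.4f}")
```

The pairwise form is quadratic in the number of cases and is shown only because it makes the definition explicit; a rank-based implementation from a statistics library would be the usual choice for a cohort of this size.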
SESSION
E-poster
E-Session 7/27 ~ 7/29 ALL DAY