Algorithm Analysis Comparison of Naïve Bayes and Logistic Regression Methods for Predicting Diabetes
Keywords:
Naïve Bayes, Logistic regression, Data mining, Diabetes, Health information systems
Abstract
Diabetes is a metabolic disease characterized by elevated blood glucose (blood sugar) levels, which over time cause severe damage to the heart, blood vessels, eyes, kidneys and nerves. It cannot be cured, but it can be managed by adopting healthier lifestyle habits. One factor contributing to the increase in diabetes is delayed diagnosis: patients often die from complications of the condition before they are ever diagnosed. Many variables and circumstances can delay the diagnosis. One approach to early detection is to use machine learning to support a fast and accurate diagnosis by modelling the classification of diabetes. This study compares two modelling methods, Logistic Regression and Naïve Bayes, on a training data set. The Naïve Bayes algorithm achieved an accuracy of 89.5% and a ROC-AUC value of 0.895. Its confusion matrix contains 88 true positives, 91 true negatives, 11 false positives and 10 false negatives, which gives a precision of 0.89 for category 0 and 0.90 for category 1. Both models take the same time to fit. It is concluded that, for diabetes classification, the model with the best values is Logistic Regression.
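The reported accuracy and precision values can be cross-checked with a few lines of arithmetic. The sketch below assumes that the four reported counts (88, 10, 11, 91) form a 2x2 confusion matrix with rows as the actual category and columns as the predicted category. The abstract does not state this layout; it is an assumption chosen because it reproduces the reported accuracy of 89.5% and the per-category precision of 0.89 and 0.90. The function names are illustrative only.

```python
# Assumed layout (not stated in the abstract): rows = actual category,
# columns = predicted category, for categories 0 and 1.
confusion = [
    [88, 10],  # actual 0: predicted 0, predicted 1
    [11, 91],  # actual 1: predicted 0, predicted 1
]


def accuracy(matrix):
    """Fraction of all samples that lie on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def precision(matrix, label):
    """Correct predictions of `label` divided by all predictions of `label`."""
    predicted = sum(row[label] for row in matrix)
    return matrix[label][label] / predicted


acc = accuracy(confusion)
prec0 = precision(confusion, 0)
prec1 = precision(confusion, 1)
print(f"accuracy={acc:.3f} precision0={prec0:.2f} precision1={prec1:.2f}")
```

Under this assumed layout, accuracy is (88 + 91) / 200, precision for category 0 is 88 / (88 + 11), and precision for category 1 is 91 / (91 + 10).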