A Web-Based Diabetes Prediction Application Using XGBoost Algorithm
DOI:
https://doi.org/10.32734/jocai.v5.i2-6290
Keywords:
Diabetes, XGBoost Algorithm, Website, Data Mining, Knowledge Discovery in Database (KDD)
Abstract
Diabetes is a chronic disease generally characterized by elevated glucose levels in the blood. Over time, it can cause serious damage to other organs and tissues, such as the blood vessels, kidneys, heart, and nerves. Machine learning provides various data mining algorithms that can be used to assist medical experts, and the accuracy of these algorithms is a measure of the effectiveness of decision support systems. Because diabetes can be predicted from a patient's medical record data, the authors set out to build a website-based application for independent diabetes prediction. The application applies data mining with the XGBoost algorithm. The dataset is divided into 80% training data and 20% testing data. Before modeling, we ran several parameter-setting scenarios to evaluate the configuration to be applied; the parameters we adjusted were colsample_bytree, gamma, learning_rate, max_depth, n_estimators, reg_alpha, reg_lambda, and subsample. After splitting the data and tuning the parameters, the resulting XGBoost model achieved an accuracy of 74.67%, a precision of 57.40%, a recall of 65.94%, and a specificity of 78.50%.
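The 80/20 split and the four reported metrics can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the random seed, and the example counts in the usage note are hypothetical, and the XGBoost model itself is left out. The sketch assumes binary labels where 1 marks a diabetic patient.

```python
import random


def split_80_20(rows, seed=0):
    """Shuffle a copy of rows and return (train, test) in an 80/20 ratio.

    The seed is an illustrative assumption; the abstract does not state one.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, with 1 as the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, and specificity from the counts."""
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "precision": _ratio(tp, tp + fp),    # predicted positives that are correct
        "recall": _ratio(tp, tp + fn),       # actual positives that are found
        "specificity": _ratio(tn, tn + fp),  # actual negatives that are found
    }
```

For example, with hypothetical counts chosen only for illustration, `classification_metrics(30, 20, 80, 20)` gives a precision and recall of 0.6 and a specificity of 0.8. Reporting specificity next to recall shows how the model performs on non-diabetic and diabetic patients separately, which accuracy alone does not reveal.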
References
World Health Organization (WHO), "Diabetes," [Online]. Available: https://www.who.int/health-topics/diabetes. [Accessed 8 May 2021].
UCI Machine Learning, "Kaggle: Pima Indians Diabetes Database," 7 October 2016. [Online]. Available: https://www.kaggle.com/uciml/pima-indians-diabetes-database. [Accessed 1 May 2021].
M. B. Hanif and Khoirudin, "SISTEM APLIKASI PREDIKSI PENYAKIT DIABETES MENGGUNAKAN FITURE SELECTION KORELASI PEARSON DAN KLASIFIKASI NAÏVE BAYES," Pengembangan Rekayasa dan Teknologi, vol. 16, no. 2, pp. 199-205, 2020.
T. Chen and C. Guestrin, "XGBoost: A Scalable Tree Boosting System," Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, vol. 13, pp. 785-794, 2016.
G. Abdurrahman and M. Sintawati, "Implementation of xgboost for classification of parkinson's disease," Journal of Physics: Conference Series, vol. 1538, no. 1, 2020.
A. Handayani, A. Jamal and A. A. Septiandri, "Evaluasi Tiga Jenis Algoritme Berbasis Pembelajaran Mesin untuk Klasifikasi Jenis Tumor Payudara," vol. 6, no. 4, pp. 394-403, 2017.
A. Ogunleye and Q.-G. Wang, "XGBoost Model for Chronic Kidney Disease Diagnosis," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 17, no. 6, pp. 2131-2140, 2020.
L. Zhang and C. Zhan, "Machine Learning in Rock Facies Classification: An Application of XGBoost".
S. K. Dey, A. Hossain and M. Rahman, "Implementation of a Web Application to Predict Diabetes Disease: An Approach Using Machine Learning Algorithm," 2018 21st International Conference of Computer and Information Technology, ICCIT 2018, pp. 1-5, 2019.
A. Mir and S. N. Dhage, "Diabetes Disease Prediction using Machine Learning on Big Data of Healthcare," Proceedings - 2018 4th International Conference on Computing, Communication Control and Automation, ICCUBEA 2018, pp. 1-6, 2018.
S. S. Dhaliwal, A.-A. Nahid and R. Abbas, "Effective Intrusion Detection System Using XGBoost," Information, vol. 9, no. 7, 2018.
Maniruzzaman, N. Kumar, M. Abedin, S. Islam, H. S. Suri, A. S. El-Baz and J. S. Suri, "Comparative approaches for classification of diabetes mellitus data: Machine learning paradigm," Computer Methods and Programs in Biomedicine, vol. 152, pp. 23-34, 2017.
Copyright (c) 2021 Data Science: Journal of Computing and Applied Informatics
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
The Authors submitting a manuscript do so on the understanding that if accepted for publication, copyright of the article shall be assigned to Data Science: Journal of Informatics Technology and Computer Science (JoCAI) and Faculty of Computer Science and Information Technology as well as TALENTA Publisher Universitas Sumatera Utara as publisher of the journal.
Copyright encompasses exclusive rights to reproduce and deliver the article in all forms and media. The reproduction of any part of this journal, its storage in databases, and its transmission by any form or media will be allowed only with written permission from Data Science: Journal of Informatics Technology and Computer Science (JoCAI).
The Copyright Transfer Form can be downloaded here.
The copyright form should carry an original signature and be sent to the Editorial Office either as an original by mail or as a scanned document.