Machine Learning-Based Non-Communicable Disease Prediction Evaluating the Impact of Hypertension, Diabetes, and Lifestyle Factors on Stroke Risk

Matthew Darell Widodo (1), Pita Melati Sulkani (2), Ariyono Setiawan (3), Abdul Razak Bin Abdul Hadi (4)
(1) Universitas Kristen Petra Surabaya, Indonesia
(2) Universitas Airlangga Surabaya, Indonesia
(3) Politeknik Pelayaran Surabaya, Indonesia
(4) Universiti Kuala Lumpur, Malaysia
How to cite (IJASEIT):
[1] M. Darell Widodo, P. Melati Sulkani, A. Setiawan, and A. R. Bin Abdul Hadi, “Machine Learning-Based Non-Communicable Disease Prediction Evaluating the Impact of Hypertension, Diabetes, and Lifestyle Factors on Stroke Risk”, Int. J. Data. Science., vol. 6, no. 2, pp. 113–126, Dec. 2025.

Chronic diseases such as diabetes, stroke, and heart disease are major challenges for the global health system. Data-driven risk prediction for these diseases is important for supporting more precise and effective medical decisions. This study aims to evaluate the main factors contributing to the incidence of diabetes, stroke, and heart disease using logistic regression analysis. The data come from health sources and include demographic variables, lifestyle factors, and health indicators. Logistic regression was used to identify the variables significantly associated with each health condition studied. The significance of each risk factor was assessed using its p-value, regression coefficient, and confidence interval. The analysis showed that age, high blood pressure, cholesterol level, and body mass index (BMI) contributed significantly to the risk of diabetes, stroke, and heart disease. Physical activity and alcohol consumption were negatively associated with risk, while smoking did not show strong significance in the model. These findings confirm that certain lifestyle factors and health conditions significantly affect the risk of chronic disease. The results can inform data-driven prevention and early intervention strategies in the health sector.
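
The evaluation described above (regression coefficient, p-value, and confidence interval per risk factor) can be illustrated with a minimal sketch. It is not the authors' code or data: the records are simulated, the variable names (`age_std`, `hypertension`, `physically_active`) and the values in `TRUE_BETA` are assumptions chosen only for illustration, and the fit uses a plain Newton-Raphson maximum-likelihood routine with Wald tests rather than any particular statistics package.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def solve(A, b):
    # Solve A x = b by Gauss-Jordan elimination with partial pivoting.
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def fit_logistic(X, y, max_iter=50, tol=1e-10):
    # Newton-Raphson maximum likelihood; each row of X starts with 1.0 (intercept).
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(max_iter):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]  # Fisher information matrix
        for xi, yi in zip(X, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, xi)))
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for c in range(k):
                    info[a][c] += p * (1.0 - p) * xi[a] * xi[c]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    # Standard errors: square roots of the diagonal of the inverse information,
    # evaluated at the last iterate before the final (negligible) step.
    se = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        se.append(math.sqrt(solve(info, unit)[j]))
    return beta, se

# Simulated data only; TRUE_BETA holds assumed values, not study results.
random.seed(0)
NAMES = ["intercept", "age_std", "hypertension", "physically_active"]
TRUE_BETA = [-2.0, 0.8, 0.9, -0.5]
X, y = [], []
for _ in range(2000):
    row = [1.0, random.gauss(0, 1),
           float(random.random() < 0.3), float(random.random() < 0.5)]
    prob = sigmoid(sum(b * v for b, v in zip(TRUE_BETA, row)))
    X.append(row)
    y.append(1 if random.random() < prob else 0)

beta, se = fit_logistic(X, y)
results = []
for name, b, s in zip(NAMES, beta, se):
    z = b / s                                        # Wald statistic
    p_value = math.erfc(abs(z) / math.sqrt(2))       # two-sided normal p-value
    lo, hi = math.exp(b - 1.96 * s), math.exp(b + 1.96 * s)  # 95% CI, odds ratio
    results.append((name, b, p_value, lo, hi))
    print(f"{name:18s} coef={b:+.3f}  OR={math.exp(b):.2f}  "
          f"95% CI=({lo:.2f}, {hi:.2f})  p={p_value:.4f}")
```

In this sketch a positive coefficient (odds ratio above 1) marks a factor associated with higher risk and a negative coefficient marks one associated with lower risk; a factor whose odds-ratio interval spans 1 would not be judged significant at the 5% level.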

Creative Commons License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
https://creativecommons.org/licenses/by-sa/4.0/