Explainable Machine Learning Framework for Thyroid Cancer Recurrence Prediction
DOI:
https://doi.org/10.47709/cnahpc.v8i3.8894
Keywords:
clinicopathological features, recurrence prediction, SHapley Additive exPlanations (SHAP), thyroid cancer, XGBoost
Abstract
Accurate prediction of thyroid cancer recurrence is essential for improving long-term patient management and supporting evidence-based clinical decision-making. Although machine learning has demonstrated promising predictive performance, limited model interpretability remains a major barrier to its clinical adoption. This study aims to develop an Explainable Machine Learning framework for thyroid cancer recurrence prediction by integrating Extreme Gradient Boosting (XGBoost) with SHapley Additive exPlanations (SHAP) using clinicopathological features. A publicly available dataset containing 383 patient records was preprocessed through label encoding, correlation analysis, Chi-Square-based feature selection, and Min-Max normalization. Logistic Regression, Decision Tree, Random Forest, and XGBoost were comparatively evaluated using 10-fold stratified cross-validation with Accuracy, Precision, Recall, F1-score, and ROC-AUC as evaluation metrics. The best-performing model was subsequently interpreted using global and local SHAP analyses. XGBoost achieved the highest performance, with an accuracy of 95.8% ± 4.4%, precision of 93.4% ± 8.3%, recall of 91.4% ± 9.9%, F1-score of 92.2% ± 8.3%, and ROC-AUC of 98.6% ± 2.5%, outperforming the other models. SHAP analysis identified Response, Risk, and N Stage as the most influential clinicopathological factors affecting recurrence prediction. This study contributes by developing a unified Explainable Machine Learning framework that integrates comparative model evaluation, XGBoost prediction, and global and local SHAP interpretation within a single workflow. The proposed framework provides accurate and clinically interpretable recurrence prediction, supporting trustworthy risk assessment and personalized decision-making in thyroid cancer management.
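The local SHAP interpretation mentioned in the abstract rests on Shapley values, which split the gap between one patient's prediction and a baseline prediction across the input features. The following is a minimal sketch of that idea in plain Python, not the authors' implementation: the scoring function `toy_score`, its coefficients, the single all-zero baseline, and the example patient are all hypothetical, and only the three feature names are taken from the abstract. The SHAP library averages over a background dataset and uses tree-specific algorithms for XGBoost, so this exact enumeration only illustrates the underlying attribution rule.

```python
from itertools import combinations
from math import factorial

FEATURES = ["Response", "Risk", "N Stage"]  # names from the abstract


def toy_score(x):
    """Hypothetical risk score over binary features (coefficients are invented)."""
    response, risk, n_stage = x
    return 0.1 + 0.4 * response + 0.2 * risk + 0.1 * n_stage + 0.2 * response * risk


def make_value_fn(model, x, base):
    """Value of a feature subset: features in the subset take the patient's
    values, all other features stay at the baseline values."""
    def value_fn(subset):
        mixed = [x[j] if j in subset else base[j] for j in range(len(x))]
        return model(mixed)
    return value_fn


def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating every subset of the other features."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Weight of a subset of this size in the Shapley average.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to subset s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


baseline = [0, 0, 0]  # hypothetical reference patient
patient = [1, 1, 0]   # hypothetical patient to explain

phi = shapley_values(make_value_fn(toy_score, patient, baseline), len(FEATURES))
for name, contribution in zip(FEATURES, phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's score and the baseline score, which is the additivity property that lets a local SHAP plot be read as a decomposition of a single prediction. Exact enumeration grows exponentially with the number of features, so it is only practical for very small feature sets such as this one.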
License
Copyright (c) 2026 Tuti Alawiyah, Taufik Wibisono, Recha Abriana Anggraini, Bambang Kelana Simpony, Yesti Siti Nurjanah

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.











