Development of a Health Insurance Premium Prediction Model using Machine Learning

Makoni Tendai; Rukwava Caroline; Mawere Talent; Chinofunga Peter Tinashe

Date

2025

Publisher

Great Zimbabwe University

Abstract

In Zimbabwe’s evolving healthcare landscape, accurately determining health insurance premiums is critical to improving affordability, reducing risk imbalances, and increasing coverage, particularly amid economic constraints and rising health costs. Traditional actuarial models often struggle to represent the complex, non-linear relationships among the socioeconomic, health, and lifestyle variables prevalent in the Zimbabwean population. This paper develops a machine learning model that predicts health insurance premiums more precisely and rationally. Five supervised regression algorithms, namely Linear Regression (LR), LASSO Regression (LASSO), K-Nearest Neighbours (KNN), Random Forest (RF), and Gradient Boosting (GB), were evaluated on a representative health insurance dataset that includes demographic and health-related attributes relevant to Zimbabwe. The models were assessed using Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R-squared (R²). The results show that ensemble learning methods, particularly Gradient Boosting, significantly outperform the traditional linear models and achieve the highest predictive accuracy. The key predictors of premium cost were chronic illness, smoking status, and number of dependents, variables that are particularly pertinent to local risk assessment. The paper advances health insurance analytics in Zimbabwe by providing evidence that machine learning can support more transparent, data-driven, and context-sensitive premium determination. The findings can help insurers, policymakers, and healthcare stakeholders who aim to expand coverage and improve trust in private and public insurance schemes.
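The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. The premium values and the names `model_a` and `model_b` below are hypothetical and chosen only for illustration; they are not the paper's data, models, or results.

```python
import math


def mse(y_true, y_pred):
    """Mean Squared Error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of MSE, in the target's units."""
    return math.sqrt(mse(y_true, y_pred))


def r2(y_true, y_pred):
    """R-squared: 1 minus residual sum of squares over total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical premiums and predictions (illustrative values only).
actual = [100.0, 150.0, 200.0, 250.0]
predictions = {
    "model_a": [110.0, 140.0, 210.0, 240.0],
    "model_b": [100.0, 155.0, 195.0, 250.0],
}

# Score each hypothetical model on all three metrics.
scores = {
    name: {
        "MSE": mse(actual, pred),
        "RMSE": rmse(actual, pred),
        "R2": r2(actual, pred),
    }
    for name, pred in predictions.items()
}

# The model with the lowest RMSE is taken as the most accurate here.
best = min(scores, key=lambda name: scores[name]["RMSE"])

for name, metric in scores.items():
    print(name, metric)
print("lowest RMSE:", best)
```

Lower MSE and RMSE and a higher R² indicate a better fit. RMSE is expressed in the same units as the premium, which makes it the easiest of the three to interpret as a typical prediction error.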

Keywords

Health Insurance, Premium Prediction, Machine Learning, Ensemble Method, Linear Regression (LR), LASSO Regression (LASSO), K-Nearest Neighbours (KNN), Random Forest (RF), Gradient Boosting (GB)

URI

https://dzimbahwehub.gzu.ac.zw/handle/123456789/1046

Collections

Articles
