Development of a Health Insurance Premium Prediction Model using Machine Learning

Abstract

In Zimbabwe’s evolving healthcare landscape, accurately determining health insurance premiums is critical to improving affordability, reducing risk imbalances, and increasing coverage, particularly amid economic constraints and rising health costs. Traditional actuarial models often struggle to capture the complex, non-linear relationships among the socioeconomic, health, and lifestyle variables prevalent in the Zimbabwean population. This paper develops a machine learning model that predicts health insurance premiums with greater accuracy and a sounder rationale. Five supervised regression algorithms, Linear Regression (LR), LASSO Regression (LASSO), K-Nearest Neighbours (KNN), Random Forest (RF), and Gradient Boosting (GB), were evaluated on a representative health insurance dataset that includes demographic and health-related attributes relevant to Zimbabwe. The models were assessed using Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R-squared (R²). The results show that ensemble learning methods significantly outperform the traditional linear models, with Gradient Boosting achieving the highest predictive accuracy. Chronic illness, smoking status, and number of dependents were identified as the key predictors of premium cost, variables that are particularly pertinent to local risk assessment. This paper advances health insurance analytics in Zimbabwe by providing evidence that machine learning can support more transparent, data-driven, and context-sensitive premium determination. The findings can help insurers, policymakers, and healthcare stakeholders who aim to expand coverage and improve trust in private and public insurance schemes.
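
For readers unfamiliar with the three evaluation metrics named in the abstract, the following is a minimal sketch of how MSE, RMSE, and R-squared are conventionally computed from observed and predicted premiums. It is not the authors' code: the function name `regression_metrics` and the four premium values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE and R-squared for paired observed/predicted values.

    Illustrative helper only (hypothetical name); standard textbook definitions.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    n = len(y_true)
    # Residual sum of squares: squared gaps between observed and predicted premiums.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mse = ss_res / n
    rmse = math.sqrt(mse)  # same units as the premium itself

    # Total sum of squares: spread of the observed premiums around their mean.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R-squared is undefined when every observed value is identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"MSE": mse, "RMSE": rmse, "R2": r2}


# Made-up premiums purely for illustration; every prediction is off by 10.
observed = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Lower MSE and RMSE and an R-squared closer to 1 indicate a better fit, which is the sense in which the abstract ranks the five algorithms. RMSE is often the easiest of the three to interpret because it is expressed in the same currency units as the premium.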

Keywords

Health Insurance, Premium Prediction, Machine Learning, Ensemble Methods, Linear Regression (LR), LASSO Regression (LASSO), K-Nearest Neighbours (KNN), Random Forest (RF), Gradient Boosting (GB)
