Predicting motor insurance claim incidence using generalized and tree-based models: A comparative statistical approach

Eslam Abdelhakim Seyam

doi:http://dx.doi.org/10.21511/ins.16(2).2025.04

Received May 10, 2025;

Accepted August 7, 2025;

Published August 14, 2025
Author(s)

Link to ORCID Index: https://orcid.org/0000-0002-2487-5106

Eslam Abdelhakim Seyam
Article Info
Volume 16 2025, Issue #2, pp. 38-53

This work is licensed under a Creative Commons Attribution 4.0 International License

Type of the article: Research Article

Abstract
Accurate prediction of motor insurance claim frequency is essential for efficient risk management, underwriting, and policy pricing. This study compares the predictive performance of Poisson Generalized Linear Models (GLMs), Decision Trees, and Generalized Additive Models (GAMs) on 108,699 motor third-party liability insurance contracts from the French Motor TPL dataset in the CASdatasets R package, which is widely used in actuarial research. The models' predictive accuracy, explainability, and flexibility are compared on training and test sets using Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and Poisson Deviance. Although the GLM offers an interpretable, accurate baseline, the GAM performs best on every metric, with the lowest MSE (0.0506), RMSE (0.2251), and Poisson Deviance (36.41% training, 37.76% test), compared to the GLM (MSE: 0.0509, RMSE: 0.2257, Poisson Deviance: 36.83% training, 38.08% test) and Decision Trees (MSE: 0.0582, RMSE: 0.2413, Poisson Deviance: 37.12% training, 38.31% test). Based on MSE, the GAM reduces prediction error by approximately 0.6% relative to the GLM and 13.1% relative to Decision Trees. These findings indicate that, among the three models, GAMs strike the best balance between explainability and predictive flexibility, which makes them well suited to insurers who want to refine risk segmentation without compromising regulatory compliance or business transparency. The study adds to research calling for interpretable, state-of-the-art statistical techniques in insurance analytics and offers practical observations for actuaries and data scientists who wish to refine motor insurance frequency modeling frameworks.
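The three evaluation metrics and the relative MSE reductions quoted in the abstract can be illustrated with a minimal Python sketch. The function names are hypothetical, and the deviance shown is the standard unscaled mean Poisson deviance; the abstract reports deviance as a percentage, and the scaling used in the paper is not stated here, so this is an assumption rather than the author's exact definition. Only the three MSE values are taken from the abstract.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted claim counts."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def mean_poisson_deviance(y_true, y_pred):
    """Unscaled mean Poisson deviance: 2 * mean(y * log(y / mu) - (y - mu)).

    The term y * log(y / mu) is taken as 0 when y == 0.
    Assumption: the paper may scale or normalize deviance differently.
    """
    total = 0.0
    for y, mu in zip(y_true, y_pred):
        if mu <= 0:
            raise ValueError("Poisson means must be strictly positive")
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += log_term - (y - mu)
    return 2.0 * total / len(y_true)


def relative_reduction(baseline, improved):
    """Fractional reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline


# MSE values as reported in the abstract.
reported_mse = {"GAM": 0.0506, "GLM": 0.0509, "Decision Tree": 0.0582}

gam_vs_glm = relative_reduction(reported_mse["GLM"], reported_mse["GAM"])
gam_vs_tree = relative_reduction(reported_mse["Decision Tree"], reported_mse["GAM"])

print(f"GAM vs GLM:  {gam_vs_glm:.1%}")
print(f"GAM vs tree: {gam_vs_tree:.1%}")
```

Applied to the reported MSE values, the sketch reproduces the reductions of roughly 0.6% and 13.1% stated in the abstract. The reported RMSE values differ from the square roots of the rounded MSE values in the fourth decimal place, which is consistent with the MSE figures having been rounded before publication.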

Keywords
claim frequency, decision trees, generalized additive models, generalized linear models, motor insurance, predictive modeling
JEL Classification
C25, C53, G22, C14, C52
References
37
Tables
7
Figures
5

- Figure 1. Claim frequency by vehicle age and Bonus-Malus level
- Figure 2. Claim frequency by driver age and Bonus-Malus level
- Figure 3. Decision Tree for claim frequency
- Figure 4. GAM smooth functions for vehicle age and driver age
- Figure 5. GAM smooth functions for Bonus-Malus by driver age group

- Table 1. Summary of dataset variables and their measurement scales
- Table 2. Summary of descriptive statistics for key variables
- Table 3. Decision Tree splits for claim frequency, with node sample sizes, deviances, and mean claim frequencies
- Table 4. Poisson GLM regression results: estimated coefficients for claim frequency
- Table 5. Parametric coefficient estimates from the GAM for claim frequency
- Table 6. Approximate significance of smooth terms in the GAM
- Table 7. Model performance comparison for claim frequency prediction

- Conceptualization
  Eslam Abdelhakim Seyam
- Data curation
  Eslam Abdelhakim Seyam
- Formal Analysis
  Eslam Abdelhakim Seyam
- Funding acquisition
  Eslam Abdelhakim Seyam
- Investigation
  Eslam Abdelhakim Seyam
- Methodology
  Eslam Abdelhakim Seyam
- Project administration
  Eslam Abdelhakim Seyam
- Resources
  Eslam Abdelhakim Seyam
- Software
  Eslam Abdelhakim Seyam
- Supervision
  Eslam Abdelhakim Seyam
- Validation
  Eslam Abdelhakim Seyam
- Visualization
  Eslam Abdelhakim Seyam
- Writing – original draft
  Eslam Abdelhakim Seyam
- Writing – review & editing
  Eslam Abdelhakim Seyam

Insurance Markets and Companies
