An Empirical Analysis of Crash Injury Severity Among Young Drivers in England: Accounting for Data Imbalance

Amirhossein Taheri, Kevin Switala, Grigorios Fountas, Abbas Sheykhfard, Nima Dadashzadeh, Steffen Müller

Research output: Contribution to journal › Article › peer-review

Abstract

Crash data analysis is key to improving road safety, but imbalanced data challenges accurate predictions for severe crashes, often leading to biased outcomes. This study investigates crash severity among young drivers (aged 17–24) in England, using crash data collected between April 2019 and February 2022. To address the imbalance issue, the performance of a standard classification and regression tree (CART) model is compared with a modified approach—random undersampling of the majority class CART (RUMC-CART). Although RUMC-CART yields slightly lower accuracy, it demonstrates superior performance in identifying severe crashes. Key contributing factors—categorized as type of vehicle and vulnerabilities, number of vehicles and casualties, area type (urban vs. rural), vehicle maneuvers and dynamic factors, and minor influences and timeline—are shown to significantly impact injury severity outcomes among young drivers. The findings of the study provide valuable insights for developing targeted interventions to enhance road safety.
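The random undersampling step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the majority class is reduced to the size of the minority class (a 1:1 ratio, which the abstract does not state), and the function name `random_undersample`, the `crash_id` field, and the toy class counts are all hypothetical. In an RUMC-CART style workflow, the balanced sample would then be passed to a CART learner in place of the full data set.

```python
import random
from collections import Counter, defaultdict


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until every class
    has as many rows as the smallest class (assumed 1:1 target)."""
    rng = random.Random(seed)

    # Group rows by their class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    # Every class is cut down to the size of the smallest one.
    target = min(len(group) for group in by_class.values())

    balanced = []
    for label, group in by_class.items():
        # rng.sample draws without replacement, so no row is duplicated.
        for row in rng.sample(group, target):
            balanced.append((row, label))

    # Shuffle so the classes are not stored in contiguous blocks.
    rng.shuffle(balanced)
    new_rows = [row for row, _ in balanced]
    new_labels = [label for _, label in balanced]
    return new_rows, new_labels


# Hypothetical toy data: 10 "severe" crashes and 90 "slight" crashes.
rows = [{"crash_id": i} for i in range(100)]
labels = ["severe" if i < 10 else "slight" for i in range(100)]

bal_rows, bal_labels = random_undersample(rows, labels, seed=42)
print(Counter(bal_labels))
```

Because the minority class already has the target size, all of its rows are retained and only majority-class rows are discarded. That discarded information is one plausible reason a model trained this way can lose some overall accuracy while improving at identifying the rarer severe class.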
Original language: English
Article number: 4793
Number of pages: 22
Journal: Applied Sciences (Switzerland)
Volume: 15
Issue number: 9
Early online date: 25 Apr 2025
Publication status: Published - 1 May 2025