Abstract
Crash data analysis is key to improving road safety, but imbalanced data make it difficult to predict severe crashes accurately and often lead to biased outcomes. This study investigates crash severity among young drivers (aged 17–24) in England, using crash data collected between April 2019 and February 2022. To address the imbalance, the performance of a standard classification and regression tree (CART) model is compared with that of a modified approach, CART with random undersampling of the majority class (RUMC-CART). Although RUMC-CART yields slightly lower accuracy, it performs better at identifying severe crashes. Key contributing factors, grouped as type of vehicle and vulnerabilities, number of vehicles and casualties, area type (urban vs. rural), vehicle maneuvers and dynamic factors, and minor influences and timeline, are shown to significantly affect injury severity outcomes among young drivers. The findings provide valuable insights for developing targeted interventions to enhance road safety.
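As a rough illustration of the resampling step named in the abstract, the sketch below randomly undersamples the majority class before any tree is fitted. It is not the authors' implementation: the function name `undersample_majority`, the 1:1 target ratio, the fixed seed, and the toy labels are all assumptions made for this example, and the abstract does not state the balancing ratio actually used.

```python
import random
from collections import Counter


def undersample_majority(rows, labels, seed=0):
    """Randomly drop rows from larger classes until each class
    has as many rows as the smallest class (assumed 1:1 ratio)."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # The smallest class sets the target size for every class.
    target = min(len(indices) for indices in by_class.values())

    kept = []
    for indices in by_class.values():
        if len(indices) == target:
            kept.extend(indices)  # minority class: keep every row
        else:
            kept.extend(rng.sample(indices, target))  # majority: random subset

    kept.sort()  # preserve the original row order
    return [rows[i] for i in kept], [labels[i] for i in kept]


# Toy data, invented for illustration: 90 "slight" vs 10 "severe" crashes.
labels = ["slight"] * 90 + ["severe"] * 10
rows = [[i] for i in range(100)]

balanced_rows, balanced_labels = undersample_majority(rows, labels)
counts = Counter(balanced_labels)
```

The balanced rows and labels would then be passed to a CART learner in place of the full training set, while evaluation would normally still use an untouched, imbalanced test set. That is consistent with the trade-off the abstract reports: somewhat lower overall accuracy in exchange for better detection of severe crashes.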
Original language | English |
---|---|
Article number | 4793 |
Number of pages | 22 |
Journal | Applied Sciences (Switzerland) |
Volume | 15 |
Issue number | 9 |
Early online date | 25 Apr 2025 |
Publication status | Published - 1 May 2025 |