Learning from text-based close call data

Peter Hughes, Miguel Figueres Esteban, Coen Van Gulijk

Research output: Contribution to journal › Article › peer-review


A key feature of big data is the variety of data sources available, which include not just numerical data but also image data, video data and even free text. The GB railways collect a large volume of free-text data every day from railway workers in the form of close call hazard reports: descriptions of instances where an accident could have occurred but did not. These close call reports contain valuable safety information that could help manage safety on the railway, but that information can be lost in a volume of data far larger than a human analyst could viably read. This paper describes the application of rudimentary natural language processing (NLP) techniques to uncover safety information from close calls. The analysis has shown that basic information extraction is possible using these rudimentary techniques, but it has also identified some limitations that arise when only basic techniques are used. Building on these findings, further research in this area will look at how the techniques demonstrated to date can be improved by using more advanced NLP techniques coupled with machine learning.
Original language: English
Pages (from-to): 184-198
Number of pages: 15
Journal: Safety and Reliability
Issue number: 3
Early online date: 29 Nov 2016
Publication status: Published - 2016

