TY - JOUR
T1 - New Avenues for Automated Railway Safety Information Processing in Enterprise Architecture
T2 - An NLP Approach
AU - Qurashi, Abdul Wahab
AU - Farhat, Zohaib A.
AU - Holmes, Violeta
AU - Johnson, Anju P.
N1 - Funding Information:
This work was supported by the Institute of Railway Research, University of Huddersfield, and the Rail Safety and Standards Board (RSSB).
Publisher Copyright:
© 2013 IEEE.
PY - 2023/5/10
Y1 - 2023/5/10
AB - Enterprise Architecture (EA) is crucial in any organisation because it defines the basic building blocks of a business. It is typically presented as a set of documents that help all departments understand the business model. In EA, safety documents are used to manage and understand safety risks. This work presents a novel similarity system for processing railway safety documents. It assesses the feasibility of automatically updating EA models from the Rule Book by verifying whether Rail Safety and Standards Board (RSSB) Rule Book clauses are present and complete in existing EA models. Additionally, a Natural Language Processing (NLP) based search feature was developed to search the database for similar existing rules, principles, and clauses based on semantic similarity. The results display the most similar clauses and rules together with their similarity scores and document names. In this study, three pre-trained models, Electra Small, DistilBERT (Distillation Bidirectional Encoder Representations from Transformers) Base, and BERT (Bidirectional Encoder Representations from Transformers) Base, were used to embed text. The similarity between document rules was then measured using cosine similarity. The findings provide conclusive evidence that BERT Base outperforms the other embedding methods in the semantic comparison of documents.
KW - cosine similarity
KW - distillation bidirectional encoder representations from transformers
KW - enterprise architecture models
KW - natural language processing
UR - http://www.scopus.com/inward/record.url?scp=85159660415&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2023.3272610
DO - 10.1109/ACCESS.2023.3272610
M3 - Article
AN - SCOPUS:85159660415
VL - 11
SP - 44413
EP - 44424
JO - IEEE Access
JF - IEEE Access
SN - 2169-3536
M1 - 10114393
ER -