TY - JOUR
T1 - Application of text mining techniques to identify actual wrong-way driving (WWD) crashes in police reports
AU - Hosseini, Parisa
AU - Khoshsirat, Seyedalireza
AU - Jalayer, Mohammad
AU - Das, Subasish
AU - Zhou, Huaguo
N1 - Publisher Copyright: © 2022 Tongji University and Tongji University Press
PY - 2023
Y1 - 2023
AB - Wrong-way driving (WWD) has been a long-standing issue for transportation agencies and law enforcement because it poses a serious threat to road users. Although rare, crashes caused by WWD are more severe than other types of crashes. Analyzing WWD crashes requires WWD incident or crash data. However, identifying actual WWD crashes among potential WWD crashes in large crash databases is time-consuming, often requiring many person-hours to review hard copies of the crash narratives in police reports. Without such review, WWD crash frequencies may be overestimated or underestimated. To fill this gap, the present study, the first of its kind, aims to identify actual WWD crashes among potential WWD crashes in police reports using machine learning methods. Recently, Bidirectional Encoder Representations from Transformers (BERT) models have shown promising results in natural language processing. In this study, we implemented the BERT model as well as five conventional classification algorithms, namely Naïve Bayes (NB), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), and Single Layer Perceptron (SLP), to classify crash report narratives as actual WWD or non-WWD crashes. Cross-validation and several performance metrics were used to evaluate each classification algorithm. Results indicated that the BERT model outperformed the other algorithms in identifying actual WWD crashes, with an accuracy of 81.59%. The BERT classification algorithm can be implemented to reduce the time needed to identify actual WWD crashes from crash report narratives.
UR - http://www.scopus.com/inward/record.url?scp=85146124088&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85146124088&partnerID=8YFLogxK
U2 - 10.1016/j.ijtst.2022.12.002
DO - 10.1016/j.ijtst.2022.12.002
M3 - Article
AN - SCOPUS:85146124088
SN - 2046-0430
VL - 12
SP - 1038
EP - 1051
JO - International Journal of Transportation Science and Technology
JF - International Journal of Transportation Science and Technology
IS - 4
ER -