Aviation-BERT: A Preliminary Aviation-Specific Natural Language Model

Author(s)
Chandra, Chetan
Jing, Xiao
Sawant, Kshitij
Elias, Lidya R.
Kirby, Michelle
Abstract
Data-driven methods form the frontier of reactive aviation safety analysis. While analysis of quantitative data from flight operations is common, text narratives of accidents and incidents have not been sufficiently mined. Among the many use cases of aviation text-data mining, automatically extracting safety concepts is probably the most important. Bidirectional Encoder Representations from Transformers (BERT) is a transformer-based large language model that is openly available and has been adapted to numerous domain-specific tasks. The present work provides a comprehensive methodology for developing a domain-specific BERT model starting from the base model. A preliminary aviation-specific BERT model is developed in this work. This Aviation-BERT model is pre-trained from the BERT-Base model on accident and incident text narratives from the National Transportation Safety Board (NTSB) and the Aviation Safety Reporting System (ASRS) using mixed-domain pre-training. Aviation-BERT is shown to outperform BERT on text-mining tasks over aviation text datasets. It is also expected to be of tremendous value in numerous downstream tasks in the analysis of aviation text corpora.
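The mixed-domain pre-training described in the abstract continues BERT's masked-language-model (MLM) objective on domain text. The following is a minimal, hedged sketch of one such MLM training step, assuming PyTorch and Hugging Face `transformers`. To stay self-contained and offline it uses a tiny randomly initialized `BertConfig` and random token ids standing in for tokenized NTSB/ASRS narratives; the paper's actual setup would instead start from the pretrained weights via `BertForMaskedLM.from_pretrained("bert-base-uncased")` and a real tokenized corpus.

```python
# Sketch of the MLM objective used in continued (mixed-domain) pre-training.
# NOTE: the tiny config, MASK_ID, and random ids are illustrative stand-ins,
# not the paper's configuration.
import torch
from transformers import BertConfig, BertForMaskedLM

torch.manual_seed(0)
config = BertConfig(vocab_size=1000, hidden_size=64, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=128)
model = BertForMaskedLM(config)  # in practice: from_pretrained("bert-base-uncased")

MASK_ID = 4  # stand-in for the [MASK] token id
input_ids = torch.randint(5, 1000, (2, 16))   # toy "tokenized narratives"
labels = input_ids.clone()

mask = torch.rand(input_ids.shape) < 0.15     # BERT masks ~15% of tokens
mask[0, 0] = True                             # ensure at least one masked position
labels[~mask] = -100                          # loss computed only on masked tokens
input_ids[mask] = MASK_ID

loss = model(input_ids=input_ids, labels=labels).loss
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
loss.backward()
optimizer.step()                              # one pre-training update
print(float(loss))
```

In the mixed-domain variant the batches would interleave general-domain text with the NTSB/ASRS narratives, so the model retains its base-domain knowledge while adapting to aviation vocabulary.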
Sponsor
US Federal Aviation Administration (FAA)
Date
2023-06
Resource Type
Text
Resource Subtype
Paper