Text mining of the Securities and Exchange Commission financial filings of publicly traded construction firms using deep learning to identify and assess risk

Jallan, Yashovardhan
Ashuri, Baabak
Tien, Iris
Song, Xinyi
Marks, Eric
Clevenger, Caroline
Risk factor identification has been a critical topic in the construction industry. It is vital for construction firms and industry stakeholders to understand the different types of risk that affect their business and financial bottom line. This research develops a systematic methodology that applies a new set of text mining methods to identify and classify the risk types affecting publicly traded construction companies, leveraging the 10-K reports they file with the Securities and Exchange Commission (SEC). A structured procedure is developed to apply advances in text mining and natural language processing (NLP) to extract information from textual disclosures. A state-of-the-art deep learning algorithm named FastText is implemented to identify risk patterns and classify the text into appropriate risk types. Key findings show that operational and financial risks associated with doing business are the most commonly disclosed risks in the filings of publicly traded construction firms. The average number of total risk disclosures rises steadily and monotonically from 2006 to 2018. Over the same period, the proportions of technology risks, reputation/intangible assets risks, financial markets risks, and third-party risks grow. The primary contributions of this research are: (a) development of a new methodology that serves as a risk thermometer for identifying and quantifying risk at the individual company level, the sub-industry level, and the overall industry level; and (b) reduction of existing information asymmetry in risk studies through the use of a data source that has not previously been used by construction researchers.
It is anticipated that the developed methodology and its results can be used by: (i) publicly traded construction companies to understand risks affecting themselves and their peers; (ii) surety bond companies and insurance providers to supplement their risk pricing models; and (iii) equity investors and capital financial institutions to make more informed risk-based decisions for their investments in the construction business.
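As a rough illustration of the aggregation step behind the "risk thermometer" idea, the sketch below assumes each risk disclosure has already been assigned a risk type by some classifier, then computes the average number of disclosures per firm and the share of each risk type by year. The rows, firm names, labels, and the `summarize_by_year` helper are hypothetical placeholders, not data or code from the study.

```python
from collections import Counter, defaultdict

# Hypothetical classified disclosures: (company, fiscal_year, risk_type).
# These rows are made-up placeholders, not results from the study.
disclosures = [
    ("FirmA", 2006, "operational"),
    ("FirmA", 2006, "financial"),
    ("FirmB", 2006, "operational"),
    ("FirmA", 2018, "operational"),
    ("FirmA", 2018, "technology"),
    ("FirmA", 2018, "third-party"),
    ("FirmB", 2018, "financial"),
    ("FirmB", 2018, "technology"),
]


def summarize_by_year(rows):
    """Return, per year, the average disclosures per firm and risk-type shares."""
    counts = defaultdict(Counter)  # year -> risk_type -> number of disclosures
    firms = defaultdict(set)       # year -> set of companies that filed
    for company, year, risk_type in rows:
        counts[year][risk_type] += 1
        firms[year].add(company)

    summary = {}
    for year in sorted(counts):
        total = sum(counts[year].values())
        summary[year] = {
            "avg_per_firm": total / len(firms[year]),
            "share": {rt: n / total for rt, n in counts[year].items()},
        }
    return summary


summary = summarize_by_year(disclosures)
for year, stats in summary.items():
    print(year, stats["avg_per_firm"], stats["share"])
```

The same grouping could be keyed by company or by sub-industry instead of by year to obtain the other levels of aggregation mentioned in the abstract.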