Title:
Identifying regional dialects in online social media

Author(s)
Eisenstein, Jacob
Advisor(s)
Editor(s)
Associated Organization(s)
Organizational Unit
Series
Supplementary to
Abstract
Electronic social media offers new opportunities for informal communication in written language, while at the same time providing new datasets that allow researchers to document dialect variation from records of natural communication among millions of individuals. The unprecedented scale of this data enables the application of quantitative methods to automatically discover the lexical variables that distinguish the language of geographical areas such as cities. This can be paired with the segmentation of geographical space into dialect regions, within the context of a single joint statistical model, thus simultaneously identifying coherent dialect regions and the words that distinguish them. Finally, a diachronic analysis reveals rapid changes in the geographical distribution of these lexical features, suggesting that statistical analysis of social media may offer new insights into the diffusion of lexical change.
Sponsor
Date Issued
2014-08-06
Extent
Resource Type
Text
Resource Subtype
Book Chapter
Rights Statement
Rights URI