Title:
Character Eyes: Seeing Language through Character-Level Taggers

dc.contributor.advisor Eisenstein, Jacob
dc.contributor.author Marone, Marc
dc.contributor.committeeMember Riedl, Mark
dc.contributor.department Computer Science
dc.date.accessioned 2019-02-12T14:42:59Z
dc.date.available 2019-02-12T14:42:59Z
dc.date.created 2018-12
dc.date.issued 2018-12
dc.date.submitted December 2018
dc.date.updated 2019-02-12T14:42:59Z
dc.description.abstract Character-level models have been used extensively in recent years in NLP tasks as both supplements to and replacements for closed-vocabulary token-level word representations. In one popular architecture, character-level RNNs, typically LSTMs, form a bottom tier creating a word representation for a sequence tagger used to predict token-level annotations such as part-of-speech (POS) tags. In this work, we examine the behavior of POS taggers from the perspective of individual hidden units within the character-level LSTM. Analysis of activation patterns on a macro scale allows us to identify the ways in which the burden of POS detection is spread across the hidden layer in different languages, as a function of their morphological properties. Using ablation tests, we show how different allocations of forward and backward units affect model arrangement and performance in different categories of languages. We use these results to offer heuristics for hyperparameter selection that are based on known linguistic traits.
dc.description.degree Undergraduate
dc.format.mimetype application/pdf
dc.identifier.uri http://hdl.handle.net/1853/60890
dc.language.iso en_US
dc.publisher Georgia Institute of Technology
dc.subject Natural language processing
dc.subject Character level models
dc.subject Part of speech tagging
dc.subject Computational linguistics
dc.title Character Eyes: Seeing Language through Character-Level Taggers
dc.type Text
dc.type.genre Undergraduate Thesis
dspace.entity.type Publication
local.contributor.advisor Eisenstein, Jacob
local.contributor.corporatename College of Computing
local.contributor.corporatename School of Computer Science
local.contributor.corporatename Undergraduate Research Opportunities Program
local.relation.ispartofseries Undergraduate Research Option Theses
relation.isAdvisorOfPublication d2334908-9b54-40ce-9a5b-26987819dd65
relation.isOrgUnitOfPublication c8892b3c-8db6-4b7b-a33a-1b67f7db2021
relation.isOrgUnitOfPublication 6b42174a-e0e1-40e3-a581-47bed0470a1e
relation.isOrgUnitOfPublication 0db885f5-939b-4de1-807b-f2ec73714200
relation.isSeriesOfPublication e1a827bd-cf25-4b83-ba24-70848b7036ac
thesis.degree.level Undergraduate
Files
Original bundle
Name:
MARONE-UNDERGRADUATERESEARCHOPTIONTHESIS-2018.pdf
Size:
260.5 KB
Format:
Adobe Portable Document Format
License bundle
Name:
LICENSE.txt
Size:
3.86 KB
Format:
Plain Text