Disentangling neural network representations for improved generalization

Cogswell, Michael Andrew

dc.contributor.advisor	Batra, Dhruv
dc.contributor.author	Cogswell, Michael Andrew
dc.contributor.committeeMember	Parikh, Devi
dc.contributor.committeeMember	Hays, James
dc.contributor.committeeMember	Goel, Ashok
dc.contributor.committeeMember	Lee, Stefan
dc.contributor.department	Interactive Computing
dc.date.accessioned	2020-05-20T17:01:40Z
dc.date.available	2020-05-20T17:01:40Z
dc.date.created	2020-05
dc.date.issued	2020-04-24
dc.date.submitted	May 2020
dc.date.updated	2020-05-20T17:01:40Z
dc.description.abstract	Despite the increasingly broad perceptual capabilities of neural networks, applying them to new tasks requires significant engineering effort in data collection and model design. Generally, inductive biases can make this process easier by leveraging knowledge about the world to guide neural network design. One such inductive bias is disentanglement, which can help prevent neural networks from learning representations that capture spurious patterns that do not generalize past the training data, and instead encourage them to capture factors of variation that explain the data generally. In this thesis, we identify three kinds of disentanglement, implement a strategy for enforcing disentanglement in each case, and show that more general representations result. These perspectives treat disentanglement as statistical independence of features in image classification, language compositionality in goal-driven dialog, and latent intention priors in visual dialog. By increasing the generality of neural networks through disentanglement, we hope to reduce the effort required to apply neural networks to new tasks and highlight the role of inductive biases like disentanglement in neural network design.
dc.description.degree	Ph.D.
dc.format.mimetype	application/pdf
dc.identifier.uri	http://hdl.handle.net/1853/62813
dc.language.iso	en_US
dc.publisher	Georgia Institute of Technology
dc.subject	Deep learning
dc.subject	Disentanglement
dc.subject	Compositionality
dc.subject	Representation learning
dc.subject	Visual dialog
dc.subject	Language emergence
dc.title	Disentangling neural network representations for improved generalization
dc.type	Text
dc.type.genre	Dissertation
dspace.entity.type	Publication
local.contributor.advisor	Batra, Dhruv
local.contributor.corporatename	College of Computing
local.contributor.corporatename	School of Interactive Computing
relation.isAdvisorOfPublication	bbee09e1-a4fa-4d99-9dfd-b0605fea0f11
relation.isOrgUnitOfPublication	c8892b3c-8db6-4b7b-a33a-1b67f7db2021
relation.isOrgUnitOfPublication	aac3f010-e629-4d08-8276-81143eeaf5cc
thesis.degree.level	Doctoral