Title:
Nonnegative matrix and tensor factorizations, least squares problems, and applications

dc.contributor.advisor Park, Haesun
dc.contributor.author Kim, Jingu en_US
dc.contributor.committeeMember Gray, Alexander
dc.contributor.committeeMember Lebanon, Guy
dc.contributor.committeeMember Monteiro, Renato
dc.contributor.committeeMember Zha, Hongyuan
dc.contributor.department Computing en_US
dc.date.accessioned 2012-02-17T19:25:30Z
dc.date.available 2012-02-17T19:25:30Z
dc.date.issued 2011-11-14 en_US
dc.description.abstract Nonnegative matrix factorization (NMF) is a useful dimension reduction method that has been investigated and applied in various areas. NMF is designed for high-dimensional data in which every element is nonnegative, and it provides a low-rank approximation formed by factors whose elements are also nonnegative. The nonnegativity constraints imposed on the low-rank factors not only enable natural interpretation but also reveal hidden structure in the data. Extending the benefits of NMF to multidimensional arrays, nonnegative tensor factorization (NTF) has been shown to be successful in analyzing complicated data sets. Despite these successes, NMF and NTF have been actively developed only in the past decade, and algorithmic strategies for computing them have not been fully studied. This thesis addresses computational challenges regarding NMF, NTF, and related least squares problems. First, efficient algorithms for NMF and NTF are developed by exploiting the connection between the NMF and NTF problems and nonnegativity-constrained least squares (NLS) problems. A key strategy is to observe the typical structure of the NLS problems arising in NMF and NTF computation and to design a fast algorithm that utilizes this structure. We propose an accelerated block principal pivoting method for solving the NLS problems, thereby significantly speeding up NMF and NTF computation. Experimental results with synthetic and real-world data sets validate the efficiency of the proposed method. In addition, a theoretical result on the classical active-set method for rank-deficient NLS problems is presented. Although the block principal pivoting method is generally more efficient than the active-set method for NLS problems, it is not applicable to rank-deficient cases. 
We show that the active-set method with a proper starting vector can in fact solve rank-deficient NLS problems without ever encountering a rank-deficient least squares problem during its iterations. Going beyond the NLS problems, we show that a block principal pivoting strategy can also be applied to l1-regularized linear regression. The l1-regularized linear regression problem, also known as the Lasso, has become very popular due to its ability to promote sparse solutions, but it is difficult to solve because the l1-regularization term is not differentiable. A block principal pivoting method and a variant that overcomes a limitation of previous active-set methods are proposed for this problem, with successful experimental results. Finally, a group-sparsity regularization method for NMF is presented. A recent challenge in data analysis for science and engineering is that data are often represented in a structured way; in particular, many data mining tasks must deal with group-structured prior information, where features or data items are organized into groups. Motivated by the observation that features or data items belonging to the same group are expected to share the same sparsity pattern in their latent factor representations, we propose mixed-norm regularization to promote group-level sparsity. Efficient convex optimization methods for handling the regularization terms are presented, along with computational comparisons among them. Application examples of the proposed method in factor recovery, semi-supervised clustering, and multilingual text analysis are presented. en_US
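To illustrate the alternating NLS framework the abstract describes, the following is a minimal sketch of computing an NMF by alternating nonnegativity-constrained least squares. It uses SciPy's Lawson-Hanson active-set NNLS solver for each subproblem rather than the accelerated block principal pivoting method the thesis proposes; the function name `nmf_anls` and all parameter choices are illustrative, not from the thesis.

```python
# Sketch: NMF via alternating nonnegativity-constrained least squares (ANLS).
# Each outer iteration fixes one factor and solves an NLS problem for the other,
# here with scipy.optimize.nnls (classical active-set method), solved column by
# column. The thesis's block principal pivoting method solves these same NLS
# subproblems, but much faster by exploiting their shared structure.
import numpy as np
from scipy.optimize import nnls

def nmf_anls(A, k, n_iter=50, seed=0):
    """Approximate a nonnegative m x n matrix A as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(n_iter):
        # Fix W, solve min_{H >= 0} ||W H - A||_F, one column of H at a time.
        for j in range(n):
            H[:, j], _ = nnls(W, A[:, j])
        # Fix H, solve min_{W >= 0} ||W H - A||_F, one row of W at a time
        # (equivalently an NLS problem with coefficient matrix H^T).
        for i in range(m):
            W[i, :], _ = nnls(H.T, A[i, :])
    return W, H
```

Under this alternating scheme the objective is nonincreasing, since each half-step solves its NLS subproblem exactly; the per-column loop is what structure-aware solvers such as block principal pivoting are designed to avoid repeating from scratch.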
dc.description.degree PhD en_US
dc.identifier.uri http://hdl.handle.net/1853/42909
dc.publisher Georgia Institute of Technology en_US
dc.subject Linear complementarity problem en_US
dc.subject Parallel factorization en_US
dc.subject Canonical decomposition en_US
dc.subject Active set method en_US
dc.subject Rank deficiency en_US
dc.subject l1-regularized linear regression en_US
dc.subject Mixed-norm regularization en_US
dc.subject Low rank approximation en_US
dc.subject Block principal pivoting en_US
dc.subject Nonnegativity constrained least squares en_US
dc.subject.lcsh Computer science
dc.subject.lcsh Matrices
dc.subject.lcsh Least squares
dc.title Nonnegative matrix and tensor factorizations, least squares problems, and applications en_US
dc.type Text
dc.type.genre Dissertation
dspace.entity.type Publication
local.contributor.advisor Park, Haesun
local.contributor.corporatename College of Computing
local.relation.ispartofseries Doctor of Philosophy with a Major in Computer Science
relation.isAdvisorOfPublication 92013a6f-96b2-4ca8-9ef7-08f408ec8485
relation.isOrgUnitOfPublication c8892b3c-8db6-4b7b-a33a-1b67f7db2021
relation.isSeriesOfPublication 41e6384f-fa8d-4c63-917f-a26900b10f64
Files
Original bundle
Name: kim_jingu_201112_phd.pdf
Size: 1.41 MB
Format: Adobe Portable Document Format