Series
Doctor of Philosophy with a Major in Mathematics
Permanent Link
Series Type
Degree Series
Description
Associated Organization(s)
Organizational Unit
338 results
Publication Search Results
Now showing 1 - 10 of 338
-
Quantitative convergence analysis of dynamical processes in machine learning (Georgia Institute of Technology, 2024-07-27) Wang, Yuqing
This thesis focuses on analyzing the quantitative convergence of selected important machine learning processes, from a dynamical perspective, in order to understand and guide machine learning practices. Machine learning is becoming increasingly popular in various fields. Typical machine learning models consist of optimization and generalization processes, whose performance depends on the network architectures, algorithms, learning rate, batch size, training strategies, etc. Many of these processes and designs utilize, either explicitly or implicitly, (nonlinear) dynamics and can be theoretically understood via (more) refined convergence analysis. More precisely: The first part of this thesis illustrates the effect of a large learning rate on optimization dynamics, which often correlates with improved generalization. Specifically, we consider non-convex and non-Lipschitz-smooth potential functions in matrix factorization problems minimized by gradient descent (GD) with large learning rates, a setting beyond the scope of classical optimization theory. We develop a new convergence analysis to show that the large learning rate biases GD towards flatter minima, where the two factors in the matrix factorization objective are more balanced. The second part is an extension of the theory in the first part to a unified mechanism of several implicit biases including edge of stability, balancing, and catapult. We broaden the previous convergence analysis to a family of objective functions with various regularities, where good regularities combined with large learning rates result in the occurrence of these phenomena.
In the third part, we concentrate on diffusion models, which are a concrete and important real-world application, and theoretically demonstrate how to choose their hyperparameters for good performance through the convergence analysis of the full generation process, including optimization and sampling. It turns out that our theory is consistent with the practical usage behind leading empirical results. In the fourth part of this thesis, we study the generalization performance of two architectures: deep residual networks (ResNets) and deep feedforward networks (FFNets). By taking these architectures as iterative maps and analyzing their convergence via the neural tangent kernel, we prove that deep ResNets can effectively separate data while deep FFNets degenerate and lose their learnability.
-
Causal Discovery from Observational Data in the Presence of Latent Confounders and Other Data Complexities (Georgia Institute of Technology, 2024-07-27) Yang, Yuqin
Causal discovery aims to recover causal relationships among variables of interest in the system. In situations where interventions (controlled experiments) on system variables are not allowed, causal discovery from only observational data has been studied, which either utilizes the conditional independence relations among observed variables, or asserts additional semi-parametric assumptions on the underlying model. However, there are complexities in real-life data that make causal discovery even more challenging. Some of the main sources of data complexity include: (i) Latent confounding, where there may exist unobserved variables that affect more than one observed variable in the system; (ii) Deterministic relations, where one observed variable may be fully dependent on other observed variables in the system; (iii) Measurement error, where we may not observe the exact value of the variables, but rather a corrupted version of them; (iv) Data heterogeneity, where the data are collected from multiple domains and do not follow the same distribution. The majority of causal discovery methods assume that these complexities are absent in the system. Naturally, naive applications of these approaches to settings that are in fact subject to data complexity issues lead to detecting spurious or erroneous causal links among variables of interest. The focus of the dissertation is on developing causal discovery methods that are capable of handling these data complexities. Specifically,
- We study the problem of causal discovery in linear causal models with deterministic relations and latent confounding. We provide necessary and sufficient conditions for unique identifiability of the model under a separability condition (i.e., the matrix indicating the independent exogenous noise terms pertaining to the observed variables is identifiable).
- We study the problem of causal discovery in linear causal models in the presence of latent confounding and/or measurement error. We characterize the extent of identifiability of the model under the separability condition together with two versions of faithfulness assumptions. We provide a graphical characterization of the models that are observationally equivalent.
- We study the problem of learning the unknown intervention targets in linear or nonlinear causal models from a collection of interventional data obtained from multiple environments. We propose the LIT algorithm, which allows latent confounders to be intervention targets. Our theoretical analysis shows that the LIT algorithm gives a more accurate estimate of the intervention target set than previous works.
-
Graphs and geometry: an interplay between local and global views (Georgia Institute of Technology, 2024-07-27) Yu, Jing
In this dissertation, we explore problems related to graphs and geometry. This work consists of two independent projects that utilize distinct proof techniques. However, they share a common underlying philosophy: we alternate between local and global perspectives as required. In Project I, we investigate the large-scale geometry of Borel graphs of polynomial growth. Krauthgamer and Lee showed that every connected graph of polynomial growth admits an injective contraction mapping to $(\mathbb{Z}^n, \|\cdot\|_\infty)$ for some $n \in \mathbb{N}$. We strengthen and generalize this result in a number of ways. In particular, answering a question of Papasoglu, we construct coarse embeddings from graphs of polynomial growth to $\mathbb{Z}^n$. Furthermore, we extend these results to Borel graphs. Namely, we show that graphs generated by free Borel actions of $\mathbb{Z}^n$ are in a certain sense universal for the class of Borel graphs of polynomial growth. This provides a general method for extending results about $\mathbb{Z}^n$-actions to all Borel graphs of polynomial growth. For example, an immediate consequence of our main result is that all Borel graphs of polynomial growth are hyperfinite, which answers a well-known question in the area. Additionally, our results yield nice applications in graph minor theory. In Project II, we investigate outerplanar graphs with positive Lin–Lu–Yau curvature. We show that all simple outerplanar graphs with minimum degree at least 2 and positive Lin–Lu–Yau curvature on every edge have maximum degree at most 9. Furthermore, if $G$ is maximally outerplanar, then $G$ has at most 10 vertices. Both upper bounds are sharp.
-
Constructing a Random Model for the Action of Frobenius on Fundamental Groups of Curves (Georgia Institute of Technology, 2024-07-27) Afton, Santana
For a smooth proper curve $X$ defined over $\mathbb{F}_p$, work of de Jong and Gaitsgory has shown that any representation \[ \pi_1(X) \to \mathrm{GL}_n(\mathbb{F}_{\ell}(\!(t)\!)) \] has finite image when restricted to $\pi_1(\overline{X})$. We create a random model of $\pi_1(X)$ by utilizing non-geometric automorphisms of $\pi_1(\overline{X})$, and prove that such behavior is generic when the image is solvable. Furthermore, this is proven using results we obtain about pro-$\ell$ free groups and pro-$\ell$ surface groups solely using group-theoretic techniques combined with the Weil conjectures. We end by discussing connections this work has to various topics in arithmetic geometry and geometric group theory.
-
Antigenic cooperation and cross-immunoreactivity (Georgia Institute of Technology, 2024-07-23) Ram Sreedharan Nair, Athulya
The evolution of RNA viruses such as Hepatitis C, HIV, and Influenza is characterized by high mutation rates that lead to a quasi-species of viral variants within each host, forming a complex ecosystem under the influence of the host immune system. Challenging traditional models that view viral evolution as a straightforward immune escape mechanism, we use a novel nonlinear ordinary differential equation model that emphasizes antigenic cooperation and cross-reactivity of the host immune response. Under cross-immunoreactivity, viral variants assume different roles, most importantly persistent variants, which persist through the infection by hiding from the host immune system, and altruistic variants, which draw the immune response to themselves and help shield the persistent variants. The role of a viral variant in this immune escape mechanism is not an inherent property of the variant; it changes with the quasi-social ecosystem of the virus, for example through the emergence of a new viral variant or the merging of two intra-host viral populations through a viral transmission between a pair of infected hosts. We explore the interactions between altruistic variants shielding persistent variants from the host immune system and discover that each altruistic variant in a cross-immunoreactivity network operates independently of the others. Connections between altruistic variants change neither their qualitative roles nor the quantitative values of the strength of persistent variants that they can shield from the host immune system. Variants strongly compete with each other to become persistent, and altruists have a maximal load for variants that they can shield from the host immune system.
We also investigate cross-immunoreactivity networks derived from real Hepatitis C patient data and find that the acute and chronic phases of infection have significantly different in-degree and out-degree distributions and centralities.
-
Convergence of Frame Series from Hilbert spaces to Banach spaces And l^1-boundedness (Georgia Institute of Technology, 2024-07-22) Yu, Pu-Ting
This thesis consists of four parts. In the first part of the thesis, we study the convergence of frame expansions from separable Hilbert spaces to separable Banach spaces and modulation spaces. A complete characterization of Schauder frames for which the associated frame expansions converge unconditionally for every alternative and every element will be presented. The second part is devoted to the conjecture proposed by Aldroubi et al. regarding the frame property of sequences generated by normal operators. By introducing a new notion named frame-normalizability, we provide several partial answers to this conjecture. In the third part, we focus on new notions named $\ell^1$-boundedness and $\ell^1$-frame-boundedness, where we consider certain generalizations of Fourier coefficients from $L^2(T)$ to separable Hilbert spaces. Instead of focusing on the exponential family, we work with bases that are topologically isomorphic to orthonormal bases, and even with frames. Two questions in this direction are of main interest. First, under what circumstances is $\ell^1$-boundedness equivalent to $\ell^1$-frame-boundedness? Second, is the collection of $\ell^1$-bounded sets closed under sums and unions? For the first question, we will prove that the equivalence holds in several cases. For the second question, we will present several intriguing implications following from the assumption that the collection of $\ell^1$-bounded sets is closed under sums and unions. Finally, we will present some frames with special structures. For example, we will construct Gabor frames with atoms with poor time-frequency decay.
-
On Extremal, Algorithmic, and Inferential Problems in Graph Theory (Georgia Institute of Technology, 2024-07-19) Dhawan, Abhishek
In this dissertation we study a variety of graph-theoretic problems lying at the intersection of mathematics, computer science, and statistics. This work consists of three parts, each of which is in turn split into a number of chapters. While each part and the chapters therein are largely independent from each other, certain common themes feature throughout (most notably, the use of probabilistic techniques). In Part I, we consider graphs and hypergraphs satisfying certain structural constraints. We examine a celebrated conjecture of Alon, Krivelevich, and Sudakov regarding vertex coloring. Our results provide improved bounds in all known cases for which the conjecture holds. Additionally, we introduce a generalized notion of local sparsity and study the independence and chromatic numbers of graphs satisfying this property. We also consider multipartite hypergraphs, a natural extension of bipartite graphs to this more general setting. We show how certain probabilistic techniques applied to problems on bipartite graphs can be adapted to multipartite hypergraphs and are therefore able to extend and generalize a number of results. In Part II, we investigate edge-coloring from an algorithmic standpoint. We focus on multigraphs of bounded maximum degree, i.e., $\Delta(G) = O(1)$. Following the so-called augmenting subgraph approach, we design deterministic and randomized algorithms using a near-optimal number of colors in the sequential setting as well as in the LOCAL model of distributed computing. Additionally, we study list-edge-coloring for list assignments satisfying certain local constraints, and describe a polynomial-time algorithm to compute such a coloring. Finally, in Part III, we explore a number of statistical inference problems in random hypergraph models.
Specifically, we consider the statistical–computational gap of finding large independent sets in sparse random hypergraphs, and the computational threshold for the detection of planted dense subhypergraphs (a generalization of the classical planted clique problem). We are interested in the power and limitations of low-degree polynomial algorithms, a powerful class of algorithms which includes the class of local algorithms as well as the algorithmic paradigms of approximate message passing and power iteration.
-
Self-similarity approaches in image and signal (Georgia Institute of Technology, 2024-07-08) Cui, Guangyu
Self-similarity frequently appears in various types of data including images and signals. In particular, self-similarity means that some local segment repeats itself throughout the data, with a distribution that follows a certain pattern. Mathematicians and scientists have developed different models to understand the structural information of the given data. Representative applications include texture synthesis, image denoising, point cloud denoising, and change detection in signals. Self-similarity based methods have advantages over traditional ones because they utilize non-local information. In this work, I propose three self-similarity based models handling different tasks on images, signals, and point clouds. In the first part, I discuss the texture edge detection problem. A patch consensus based model is introduced. The model utilizes the patch information and the neighbor information as a consensus to give a clear idea of the boundary by emphasizing the similarities and differences across textures. In the next topic, I propose a general framework for repeated pattern detection in signals. From a given signal, the proposed method uses intrinsic patch-wise self-similarity responses and efficiently finds repeating patterns in a nested structure. Lastly, I discuss the problem of constructing a clean surface from a noisy point cloud. The proposed method denoises the signed distance function (of the point cloud) using a modified G-norm along the tangential direction of the input point cloud data. We apply the augmented Lagrangian method to optimize the objective energy function and solve the subproblems. I present various numerical results to validate the proposed methods.
-
Random subsequences problems: asymptotics, variance, and quantum statistics. (Georgia Institute of Technology, 2024-05-07) Deslandes, Clement
This work considers some combinatorial problems on random words and their applications. The starting point of this endeavor is the following question: given two random words, "how much do they have in common"? Even though this question has emerged independently in various fields, including computer science, biology, and linguistics, it remains mostly unsolved. Firstly, we study the asymptotic distribution of the length of the longest common and increasing subsequences. There we consider a totally ordered alphabet, say $1, \dots, m$, and the subsequences are simply made of a block of 1's, followed by a block of 2's, and so on (such a subsequence is increasing, but not strictly). Secondly, we deal with the problem of the variance of the length of the longest common subsequence (LCS). By introducing a general framework going beyond this problem, partial results in this direction are presented, and various upper and lower variance bounds are revisited in diverse settings. Lastly, we consider the Longest Increasing Subsequences (LIS) of one random word, and the surprising connection with quantum statistics.
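As a rough illustration of the quantity discussed in the abstract above, the length of the longest common subsequence of two random words can be computed with the standard dynamic program and sampled empirically. This is only a sketch and is not taken from the thesis: the function names (`lcs_length`, `random_word`, `sample_lcs`) are hypothetical, and drawing letters independently and uniformly from $\{1, \dots, m\}$ is an assumption made here for illustration.

```python
import random


def lcs_length(x, y):
    """Length of the longest common subsequence of sequences x and y.

    Standard O(len(x) * len(y)) dynamic program, keeping one row at a time.
    """
    prev = [0] * (len(y) + 1)
    for a in x:
        curr = [0]
        for j, b in enumerate(y, start=1):
            if a == b:
                # Extend the best common subsequence of the two prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Drop the last letter of one word or the other.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def random_word(n, m, rng):
    """Word of length n with letters drawn uniformly from 1..m (assumption)."""
    return [rng.randint(1, m) for _ in range(n)]


def sample_lcs(n, m, trials, seed=0):
    """LCS lengths for `trials` independent pairs of random words."""
    rng = random.Random(seed)
    return [
        lcs_length(random_word(n, m, rng), random_word(n, m, rng))
        for _ in range(trials)
    ]
```

Calling `sample_lcs(n, m, trials)` for growing `n` and taking the sample mean and variance of the returned list gives an empirical view of the fluctuations whose variance the abstract refers to.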
-
Topology, Geometry, and Combinatorics of Fine Curve Graphs (Georgia Institute of Technology, 2024-05-02) Shapiro, Roberta
The goal of this thesis is to explore curve graphs, which are combinatorial tools that encode topological information about surfaces. We focus on variants of the fine curve graph of a surface, whose vertices are essential simple closed curves on the surface and whose edges connect pairs of disjoint curves. We will prove various geometric, topological, and combinatorial results about these curve graph variants, including hyperbolicity (or lack thereof), contractibility of induced flag complexes, automorphism groups, and admissible induced subgraphs.