Organizational Unit:
School of Computational Science and Engineering

Publication Search Results

  • Item
    Cooperation in Multi-Agent Reinforcement Learning
    (Georgia Institute of Technology, 2021-12-13) Yang, Jiachen
    As progress in reinforcement learning (RL) gives rise to increasingly general and powerful artificial intelligence, society needs to anticipate a possible future in which multiple RL agents must learn and interact in a shared multi-agent environment. When a single principal has oversight of the multi-agent system, how should agents learn to cooperate via centralized training to achieve individual and global objectives? When agents belong to self-interested principals with imperfectly aligned objectives, how can cooperation emerge from fully decentralized learning? This dissertation addresses both questions by proposing novel methods for multi-agent reinforcement learning (MARL) and demonstrating the empirical effectiveness of these methods in high-dimensional simulated environments. To address the first case, we propose new algorithms for fully cooperative MARL in the paradigm of centralized training with decentralized execution. First, we propose a method based on multi-agent curriculum learning and multi-agent credit assignment for the setting where global optimality is defined as the attainment of all individual goals. Second, we propose a hierarchical MARL algorithm that discovers and learns interpretable, useful skills for a multi-agent team to optimize a single team objective. Extensive experiments with ablations show the strengths of our approaches over state-of-the-art baselines. To address the second case, we propose learning algorithms that attain cooperation within a population of self-interested RL agents. We propose a new agent design equipped with the ability to incentivize other RL agents and to account explicitly for those agents' learning processes. This agent overcomes a key limitation of fully decentralized training and generates emergent cooperation in difficult social dilemmas. We then extend and apply this technique to the problem of incentive design, where a central incentive designer optimizes a global objective solely by intervening on the rewards of a population of independent RL agents. Experiments on the problem of optimal taxation in a simulated market economy demonstrate the effectiveness of this approach. (A toy sketch of the incentive-gradient idea appears under Illustrative Sketches below.)
  • Item
    On Computation and Application of Optimal Transport
    (Georgia Institute of Technology, 2021-07-28) Xie, Yujia
    The Optimal Transport (OT) problem naturally arises in various machine learning problems where one needs to align data from multiple sources. For example, the training data and the application scenario often have a domain gap: the training data may consist of annotated photos collected in the daytime, while the application must run at night. In this case, we need to align the two datasets so that the annotation information can be shared across them. During my Ph.D. study, I proposed scalable algorithms for efficient OT computation and novel applications of OT in end-to-end learning. For OT computation, I consider both the discrete case and the continuous case. For the discrete case, I develop an Inexact Proximal point method for the exact Optimal Transport problem (IPOT), with the proximal operator approximately evaluated at each iteration using projections onto the probability simplex. The algorithm (a) converges to the exact Wasserstein distance, with theoretical guarantees and robust selection of the regularization parameter, (b) alleviates numerical stability issues, (c) has computational complexity similar to that of the Sinkhorn algorithm, and (d) avoids the shrinking problem when applied to generative models. Furthermore, I propose a new IPOT-based algorithm that obtains sharper Wasserstein barycenters. For the continuous case, I propose an implicit generative learning-based framework called SPOT (Scalable Push-forward of Optimal Transport). Specifically, we approximate the optimal transport plan by the pushforward of a reference distribution and cast the optimal transport problem as a minimax problem, which we can then solve efficiently using primal-dual stochastic gradient-type algorithms. To explore the connections between OT and end-to-end learning, I develop a differentiable top-k operator and a differentiable permutation step. The top-k operation, i.e., finding the k largest or smallest elements of a collection of scores, is an important model component in information retrieval, machine learning, and data mining. However, if the top-k operation is implemented in an algorithmic way, e.g., using a bubble-sort-style procedure, the resulting model cannot be trained end to end with prevalent gradient-descent algorithms, because such implementations typically involve swapping indices, whose gradient cannot be computed. Moreover, the mapping from the input scores to the indicator vector of top-k membership is essentially discontinuous. To address this issue, we propose a smoothed approximation, the SOFT (Scalable Optimal transport-based diFferenTiable) top-k operator. Specifically, the SOFT top-k operator approximates the output of the top-k operation as the solution of an Entropic Optimal Transport (EOT) problem, and its gradient can then be efficiently approximated from the optimality conditions of the EOT problem. We apply the proposed operator to the k-nearest-neighbors and beam-search algorithms and demonstrate improved performance. For the differentiable permutation step, I connect optimal transport to a variant of the regression problem in which the correspondence between input and output data is not available. Such shuffled data is commonly observed in many real-world problems. In flow cytometry, for example, the measuring instruments may not be able to maintain the correspondence between samples and measurements. Due to the combinatorial nature of the problem, most existing methods are applicable only when the sample size is small, and are limited to linear regression models. To overcome these bottlenecks, we propose a new computational framework, ROBOT, for the shuffled regression problem that is applicable to large data and complex nonlinear models. Specifically, we reformulate regression without correspondence as a continuous optimization problem. Then, by exploiting the interaction between the regression model and the data correspondence, we develop a hypergradient approach based on differentiable programming techniques. The hypergradient approach essentially views the data correspondence as an operator of the regression, which allows us to find a better descent direction for the model parameters by differentiating through the data correspondence. ROBOT can be further extended to the inexact-correspondence setting, where there may be no exact alignment between the input and output data. Thorough numerical experiments show that ROBOT achieves better performance than existing methods in both linear and nonlinear regression tasks, including real-world applications such as flow cytometry and multi-object tracking. (Toy sketches of the SOFT operator and of differentiating through a soft correspondence appear under Illustrative Sketches below.)
  • Item
    Learning dynamic processes over graphs
    (Georgia Institute of Technology, 2020-07-09) Trivedi, Rakshit
    Graphs appear as a versatile representation of information across domains spanning social networks, biological networks, transportation networks, molecular structures, knowledge networks, web information networks, and many more. Graphs represent heterogeneous information about real-world entities and the complex relationships between them in a very succinct manner. At the same time, graphs exhibit combinatorial, discrete, and non-Euclidean properties, in addition to being inherently sparse and incomplete, which poses several challenges to techniques that analyze and study these graph structures. Various approaches across fields spanning network science, game theory, stochastic processes, and others provide excellent theoretical and analytical tools, with interpretability benefits, for analyzing these networks. However, such approaches do not learn from data, and they make assumptions about the real world that capture only a subset of its properties. More importantly, they do not support the predictive capabilities critical for decision-making applications. In this thesis, we develop novel data-driven learning approaches that incorporate useful inductive biases inspired by these classical approaches. The resulting learning approaches exhibit more general properties that go beyond conventional probabilistic assumptions and allow for building transferable and interpretable modules. We anchor these approaches around two fundamental questions: (i) (Formation Process) How do these networks come into existence? and (ii) (Temporal Evolution Process) How do real-world networks evolve over time? First, we focus on the challenge of learning with highly sparse and incomplete knowledge graphs, where it is important to leverage multiple input graphs to support accurate performance on a variety of downstream applications such as recommendation, search, and question-answering systems. Specifically, we develop a large-scale multi-graph deep relational learning framework that identifies entity linkage as a vital component of data fusion and learns to jointly perform representation learning and graph linkage across multiple graphs, with applications to relational reasoning and knowledge construction. Next, we consider networks that evolve over time and propose a generative model of dynamic graphs that encodes evolving network information into low-dimensional representations, facilitating accurate downstream event-prediction tasks. Our approach relies on the coevolution principle, namely that network structure evolution and network activities are tightly coupled processes, and develops a multi-time-scale temporal point process formulation parameterized by a recurrent architecture comprising a novel Temporal Attention mechanism. Representation learning is posed as a latent mediation process: observed network processes evolve the states of nodes, while this node evolution governs the future dynamics of the observed processes. The learned representations are applied to downstream dynamic link prediction and to predicting the times of future realizations (events) of both observed processes. Finally, we investigate the implications of adopting an optimization perspective on network formation mechanisms when building learning approaches for graph-structured data. We first focus on global mechanisms that govern the formation of links in a network and build an algorithm based on inverse reinforcement learning to discover latent mechanisms directly from observed data; optimizing these mechanisms enables a graph-construction procedure capable of producing graphs with properties similar to the observed data. Such an approach facilitates transfer and generalization and has been applied to a variety of real-world graphs. In the last part, we consider settings where the agents forming links are strategic, and we build a learnable model of network-emergence games that jointly discovers the underlying payoff mechanisms and the strategic profiles of agents from data. This approach enables learning interpretable and transferable payoffs, while the learned game serves as a model for strategic prediction tasks; both are applied to several real-world networks. (A toy sketch of a point-process conditional intensity appears under Illustrative Sketches below.)
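
Illustrative Sketches

The incentive mechanism described in "Cooperation in Multi-Agent Reinforcement Learning" centers on an agent that differentiates through another agent's learning step. The following Python snippet is a toy instantiation of that idea, not the dissertation's algorithm: it assumes a one-shot two-action social dilemma solved in expectation, a single recipient taking one exact policy-gradient step, and illustrative payoffs and step sizes.

    import torch

    # Hypothetical one-shot social dilemma, solved in expectation (no sampling):
    # the recipient prefers to defect, but cooperation is better for the
    # incentivizer's global objective.
    env_reward    = torch.tensor([1.0, 2.0])   # recipient payoff for [cooperate, defect]
    social_reward = torch.tensor([3.0, 0.0])   # incentivizer payoff for [cooperate, defect]

    theta = torch.zeros(2, requires_grad=True)       # recipient policy logits
    eta   = torch.zeros(2, requires_grad=True)       # incentive paid per recipient action
    alpha, lr = 1.0, 0.1                             # recipient / incentivizer step sizes

    for step in range(300):
        pi = torch.softmax(theta, dim=0)
        # The recipient maximizes its environment reward plus received incentives.
        recipient_obj = (pi * (env_reward + eta)).sum()
        grad_theta, = torch.autograd.grad(recipient_obj, theta, create_graph=True)
        theta_next = theta + alpha * grad_theta      # recipient's learning step, kept differentiable

        # The incentivizer evaluates its objective after the recipient learns,
        # netting out the cost of the incentives it pays.
        pi_next = torch.softmax(theta_next, dim=0)
        incentivizer_obj = (pi_next * (social_reward - eta)).sum()
        grad_eta, = torch.autograd.grad(incentivizer_obj, eta)
        with torch.no_grad():
            eta += lr * grad_eta                     # ascend the incentivizer's objective
        theta = theta_next.detach().requires_grad_() # commit the recipient's update

    print(torch.softmax(theta, dim=0))   # mass should shift toward cooperation
    print(eta)                           # learned incentive should favor cooperating

The key line is create_graph=True: it keeps the recipient's policy-gradient step in the computation graph, so the incentivizer's gradient with respect to eta flows through the recipient's learning process.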
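The SOFT top-k operator in "On Computation and Application of Optimal Transport" replaces the hard top-k indicator with the solution of an entropic OT problem. The sketch below is one minimal version under stated assumptions: two anchor points at the score extremes, a squared-distance cost, and gradients obtained by backpropagating through unrolled Sinkhorn iterations (the dissertation instead derives the gradient from the optimality conditions of the EOT problem). Anchor choice, cost scaling, and iteration counts here are illustrative.

    import torch

    def soft_top_k(scores, k, eps=0.1, n_iters=200):
        # Transport the n scores onto two anchors ("bottom" and "top"); the mass
        # each score sends to the top anchor, rescaled by n, is a differentiable
        # stand-in for the hard 0/1 top-k indicator.
        n = scores.shape[0]
        anchors = torch.stack([scores.min(), scores.max()]).detach()  # assumed anchor choice
        C = (scores[:, None] - anchors[None, :]) ** 2       # cost to each anchor, shape (n, 2)
        C = C / (C.max().detach() + 1e-12)                  # scale for numerical stability
        mu = torch.full((n,), 1.0 / n)                      # uniform mass on the scores
        nu = torch.tensor([(n - k) / n, k / n])             # (n-k)/n to bottom, k/n to top
        K = torch.exp(-C / eps)                             # Gibbs kernel
        b = torch.ones(2)
        for _ in range(n_iters):                            # plain Sinkhorn updates
            a = mu / (K @ b)
            b = nu / (K.T @ a)
        P = a[:, None] * K * b[None, :]                     # entropic transport plan
        return n * P[:, 1]                                  # soft top-k membership

    scores = torch.tensor([0.3, 2.0, -1.0, 1.5, 0.1], requires_grad=True)
    gamma = soft_top_k(scores, k=2)
    print(gamma)                            # near 1 for the two largest scores
    soft_sum = (gamma * scores).sum()       # differentiable "sum of the top-k scores"
    soft_sum.backward()                     # gradients flow through the iterations
    print(scores.grad)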
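The hypergradient idea behind ROBOT, viewing the data correspondence as an operator of the regression, can be illustrated in the same style: compute a soft correspondence with entropic OT, keep it in the computation graph, and differentiate the regression loss through it. This is a toy sketch, not the ROBOT implementation: the shuffled linear-regression setup, unrolled Sinkhorn (in place of ROBOT's hypergradient machinery), and all hyperparameters are assumptions.

    import torch

    def sinkhorn_plan(C, eps=0.1, n_iters=100):
        # Entropic OT plan between two uniform discrete distributions,
        # used here as a soft data correspondence.
        n, m = C.shape
        K = torch.exp(-C / eps)
        mu = torch.full((n,), 1.0 / n)
        nu = torch.full((m,), 1.0 / m)
        b = torch.ones(m)
        for _ in range(n_iters):
            a = mu / (K @ b)
            b = nu / (K.T @ a)
        return a[:, None] * K * b[None, :]

    # Hypothetical shuffled linear regression: y is a permuted noisy copy of
    # 2 * x, so the row-to-row correspondence between x and y is lost.
    torch.manual_seed(0)
    n = 50
    x = torch.randn(n, 1)
    y = (2.0 * x + 0.05 * torch.randn(n, 1))[torch.randperm(n)]

    w = torch.zeros(1, requires_grad=True)
    opt = torch.optim.Adam([w], lr=0.05)
    for step in range(300):
        pred = x * w                               # regression model (here: linear)
        C = (pred - y.T) ** 2                      # pairwise squared residuals, (n, n)
        P = sinkhorn_plan(C / C.detach().max())    # soft correspondence, kept in the graph
        loss = (P * C).sum()                       # regression loss under the correspondence
        opt.zero_grad()
        loss.backward()                            # gradient flows through the correspondence
        opt.step()
    print(w)   # typically recovers a slope near 2; shuffled data can also admit
               # a mirror-image local optimum near -2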
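The temporal point process formulation in "Learning dynamic processes over graphs" hinges on a conditional intensity for events between node pairs with a learned time scale. Below is a minimal sketch of one plausible parameterization, a modulated softplus over concatenated node embeddings; the class name, dimensionality, and use of a single scale parameter are assumptions, and the thesis uses richer, recurrently updated node states with temporal attention.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class PairwiseIntensity(nn.Module):
        # Conditional intensity lambda_{uv}(t) for an event on node pair (u, v).
        # The modulated softplus psi * log(1 + exp(g / psi)) keeps the intensity
        # positive, with the learned scale psi standing in for a time scale.
        def __init__(self, dim):
            super().__init__()
            self.score = nn.Linear(2 * dim, 1)
            self.log_psi = nn.Parameter(torch.zeros(1))

        def forward(self, z_u, z_v):
            g = self.score(torch.cat([z_u, z_v], dim=-1))
            psi = self.log_psi.exp()
            return psi * F.softplus(g / psi)

    # Toy usage: the point-process log-likelihood of observing an event after a
    # waiting time dt is log(intensity) minus the survival term intensity * dt
    # (valid when the intensity is approximately constant over the window).
    dim = 8
    model = PairwiseIntensity(dim)
    z_u, z_v = torch.randn(dim), torch.randn(dim)
    lam = model(z_u, z_v)
    dt = 0.5
    log_lik = torch.log(lam) - lam * dt
    log_lik.backward()   # gradients flow to the intensity parameters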