Do we need all the neurons we utilize? An exploration into costs and scalability for large-scale multi-relational graph learning

Author(s)
Sathidevi, Lakshmi
Abstract
The first step in HW-SW co-design is algorithm selection. The chosen algorithm places hard boundaries on the efficiency and scalability that a custom hardware accelerator can achieve: a smartly selected algorithm can yield several times the performance gains of a cutting-edge accelerator designed for a poorly selected one. This study presents Het#GNN, an unsupervised network embedding algorithm that extends the #GNN algorithm to multi-relational graph learning. The goal is a scalable, efficient, and low-latency solution for large-scale multi-relational graph learning. Graph Neural Network (GNN) models have shown promising results in various applications; however, they are often hindered by high memory and computational requirements, limiting their applicability to large-scale real-world scenarios. The biomedical community has shown growing interest in more efficient and less heavily parameterized methods for large-scale multi-relational graph learning. Het#GNN is a simple yet powerful unsupervised algorithm that addresses the efficiency and scalability challenges faced by GNN algorithms by enabling a parameter-free approach to network embedding for large-scale multi-relational graphs. Het#GNN competes with the best-performing models at a fraction of their runtime and power costs, and it does so on a consumer CPU while all the other methods rely on server GPUs for acceleration. The need for a scalable, efficient, and low-latency graph learning algorithm is increasingly pressing in the biomedical research community. Decagon, a widely cited work, applied the multi-relational GNN model R-GCN to polypharmacy drug side-effect prediction, obtaining strong results and medically relevant predictions; however, a single training epoch of this model takes 36 hours on a Tesla P40 GPU.
Such turnaround times are detrimental to the pace of biomedical research and discovery. Het#GNN is presented as a solution and is applied to the Decagon dataset to demonstrate that we may sometimes be using many time- and energy-expensive neurons where none are required. By making simple yet smart choices at the algorithmic level, Het#GNN offers considerable gains in efficiency, scalability, and latency over other best-performing methods without requiring high-end computational resources. It is hoped that this simple and accessible contribution enables biomedical researchers to accelerate their own research and discovery pipelines that involve large-scale multi-relational graph learning. The results of this work are expected to benefit other domains and areas of application as well, such as real-time graph learning.
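To make the "parameter-free" idea concrete, the following is a minimal illustrative sketch of the hashing-based embedding family that #GNN belongs to, not the thesis's actual Het#GNN algorithm: each node's embedding is a tuple of min-hash values over its iteratively expanded neighborhood, so there are no trainable weights at all. The function name, the universal-hash construction, and the fixed iteration count are assumptions made for illustration.

```python
import random

def minhash_embeddings(adj, num_hashes=8, iterations=2, seed=0):
    """Illustrative parameter-free node embedding via iterative min-hashing.

    adj: dict mapping node -> list of neighbor nodes.
    Returns a dict mapping node -> tuple of num_hashes min-hash values,
    computed over the node's iterations-hop neighborhood. No trainable
    weights are involved -- this is NOT the thesis's Het#GNN, only a
    sketch of the same algorithmic family.
    """
    rng = random.Random(seed)
    # Universal hash functions h_i(x) = (a_i * x + b_i) mod p.
    p = 2_147_483_647
    coeffs = [(rng.randrange(1, p), rng.randrange(p)) for _ in range(num_hashes)]

    def h(i, x):
        a, b = coeffs[i]
        return (a * hash(x) + b) % p

    # Each node starts with a sketch derived from its own identity.
    sketch = {v: tuple(h(i, v) for i in range(num_hashes)) for v in adj}

    for _ in range(iterations):
        new_sketch = {}
        for v, nbrs in adj.items():
            # Merge own sketch with neighbors' sketches coordinate-wise;
            # taking the minimum approximates a min-hash over the union
            # of the underlying neighborhoods.
            cols = [sketch[v]] + [sketch[u] for u in nbrs]
            new_sketch[v] = tuple(min(col) for col in zip(*cols))
        sketch = new_sketch
    return sketch
```

Similarity between two nodes can then be estimated as the fraction of matching sketch coordinates, and the whole computation runs on a CPU in time linear in the number of edges per iteration. A multi-relational extension in this spirit could, for example, maintain one such sketch per relation type and concatenate them, though that design choice is likewise only an assumption here.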
Date
2024-04-30
Resource Type
Text
Resource Subtype
Thesis