Title:
Leveraging machine learning for enhancing code performance and programming productivity
Author(s)
Ye, Fangke
Advisor(s)
Sarkar, Vivek
Abstract
As hardware grows more complex and diverse, its performance continues to improve, but software struggles to keep up and fully realize these gains. Only a handful of expert programmers can harness the full potential of modern hardware using the low-level programming primitives it exposes. Meanwhile, widely adopted high-level, dynamically typed programming languages such as Python and JavaScript offer high productivity but suffer from low performance because they lack the static type information that compiler optimizations need. As a result, it is increasingly difficult for the broad developer population to write high-performance programs that exploit evolving hardware while maintaining high programming productivity.
This thesis proposes the use of machine learning to enhance both programming productivity and code performance. First, we present a neural-network-based system that computes code-semantics similarity for C/C++ code, with the goal of identifying semantically equivalent high-performance code for a given low-performance input; this approach incorporates a context-aware semantics structure and an extensible neural code similarity scoring algorithm. Second, we show how a graph-based deep learning method can infer types in JavaScript to improve productivity; our approach employs multiple graph neural network models and a novel type flow graph representation to infer types in dynamically typed languages without manual annotations. Finally, we demonstrate a new approach to concrete type inference for Python programs that combines machine learning and SMT solving, enabling ahead-of-time code optimization for dynamically typed languages without requiring programmers to provide any type annotations.
Date Issued
2024-04-25
Resource Type
Text
Resource Subtype
Dissertation