Master of Science in Computer Science

Series

Master of Science in Computer Science

Permanent Link

https://hdl.handle.net/1853/73328

Series Type

Degree Series

Associated Organization(s)

Organizational Unit

School of Computer Science

Full item page

Publication Search Results

Now showing 1 - 2 of 2

Performance understanding and tuning of iterative computation using profiling techniques

(Georgia Institute of Technology, 2010-05-18) Ozarde, Sarang Anil

Most applications spend a significant amount of time in the iterative parts of a computation. They typically iterate over the same set of operations with different values. These values either depend on inputs or values calculated in previous iterations. While loops capture some iterative behavior, in many cases such a behavior is spread over whole program sometimes through recursion. Understanding iterative behavior of the computation can be very useful to fine-tune it. In this thesis, we present a profiling based framework to understand and improve performance of iterative computation. We capture the state of iterations in two aspects 1) Algorithmic State 2) Program State. We demonstrate the applicability of our framework for capturing algorithmic state by applying it to the SAT Solvers and program state by applying it to a variety of benchmarks exhibiting completely parallelizable loops. Further, we show that such a performance characterization can be successfully used to improve the performance of the underlying application. Many high performance combinatorial optimization applications involve SAT solving. A variety of SAT solvers have been developed that employ different data structures and different propagation methods for converging on a fixed point for generating a satisfiable solution. The performance debugging and tuning of SAT solvers to a given domain is an important problem encountered in practice. Unfortunately not much work has been done to quantify the iterative efficiency of SAT solvers. In this work, we develop quantifiable measures for calculating convergence efficiency of SAT solvers. Here, we capture the Algorithmic state of the application by tracking the assignment of variables for each iteration. A compact representation of profile data is developed to track the rate of progress and convergence. The novelty of this approach is that it is independent of the specific strategies used in individual solvers, yet it gives key insights into the "progress" and "convergence behavior" of the solver in terms of a specific implementation at hand. An analysis tool is written to interpret the profile data and extract values of the following metrics such as: average convergence rate, efficiency of iteration and variable stabilization. Finally, using this system we produce a study of 4 well known SAT solvers to compare their iterative efficiency using random as well as industrial benchmarks. Using the framework, iterative inefficiencies that lead to slow convergence are identified. We also show how to fine-tune the solvers by adapting the key steps. We also show that the similar profile data representation can be easily applied to loops, in general, to capture their program state. One of the key attributes of the program state inside loops is their branch behavior. We demonstrate the applicability of the framework by profiling completely parallelizable loops (no cross-iteration dependence) and by storing the branching behavior of each iteration. The branch behavior across a group of iterations is important in devising the thread warps from parallel loops for efficient execution on GPUs. We show how some loops can be effectively parallelized on GPUs using this information.
An Optimization Framework for Embedded Processors with Auto-Modify Addressing Modes

(Georgia Institute of Technology, 2004-12-08) Lau, ChokSheak

Modern embedded processors with dedicated address generation unit support memory accesses using indirect addressing mode with auto-increment and auto-decrement. The auto-increment/decrement mode, if properly utilized, can save address arithmetic instructions, reduce static and dynamic footprint of the program and speed up the execution as well. We propose an optimization framework for embedded processors based on the auto-increment and decrement addressing modes for address registers. Existing work on this class of optimizations focuses on using an access graph and finding the maximum weight path cover to find an optimized stack variables layout. We take this further by using coalescing, addressing mode selection and offset registers to find further opportunities for reducing the number of load-address instructions required. We also propose an algorithm for building the layout with considerations for memory accesses across basic blocks, because existing work mainly considers intra-basic-block information. We then use the available offset registers to try to further reduce the number of address arithmetic instructions after layout assignment.

Series

Master of Science in Computer Science

Permanent Link

Series Type

Description

Associated Organization(s)

Associated Organization(s)

Filters

Author

Advisor

Date

Organization

Series

Resource Type

Resource Subtype

Has files

Record Type

Settings

Sort By

Results per page

Publication Search Results

Georgia Tech Library

Series Master of Science in Computer Science

Permanent Link

Series Type

Description

Associated Organization(s)

Associated Organization(s)

Filters

Author

Advisor

Date

Organization

Series

Resource Type

Resource Subtype

Has files

Record Type

Settings

Sort By

Results per page

Publication Search Results

Series

Master of Science in Computer Science