Person: Kim, Hyesoon


Publication Search Results

  • Power Modeling for GPU Architecture Using McPAT
    (Georgia Institute of Technology, 2013) Lim, Jieun; Lakshminarayana, Nagesh B.; Kim, Hyesoon; Song, William; Yalamanchili, Sudhakar; Sung, Wonyong
    Graphics Processing Units (GPUs) are very popular for both graphics and general-purpose applications. Since GPUs operate many processing units and manage multiple levels of memory hierarchy, they consume a significant amount of power. Although several power models for CPUs are available, the power consumption of GPUs has not been studied much yet. In this paper, we develop a new power model for GPUs by utilizing McPAT, a CPU power tool. We generate initial power model data from McPAT with a detailed GPU configuration, and then adjust the models by comparing them with empirical data. We use NVIDIA's Fermi architecture for building the power model, and our model estimates the GPU power consumption with an average error of 7.7% and 12.8% for the microbenchmarks and Merge benchmarks, respectively. (An illustrative sketch of the calibration step appears after this list.)
  • SD³: A Scalable Approach to Dynamic Data-Dependence Profiling
    (Georgia Institute of Technology, 2011) Kim, Minjang; Lakshminarayana, Nagesh B.; Kim, Hyesoon; Luk, Chi-Keung
    As multicore processors are deployed in mainstream computing, the need for software tools to help parallelize programs is increasing dramatically. Data-dependence profiling is an important technique to exploit parallelism in programs. More specifically, manual or automatic parallelization can use the outcomes of data-dependence profiling to guide where to parallelize in a program. However, state-of-the-art data-dependence profiling techniques are not scalable, as they suffer from two major issues when profiling large and long-running applications: (1) runtime overhead and (2) memory overhead. Existing data-dependence profilers are either unable to profile large-scale applications or only report very limited information. In this paper, we propose a scalable approach to data-dependence profiling that addresses both runtime and memory overhead in a single framework. Our technique, called SD³, reduces the runtime overhead by parallelizing the dependence profiling step itself. To reduce the memory overhead, we compress memory accesses that exhibit stride patterns and compute data dependences directly in a compressed format. We demonstrate that SD³ reduces the runtime overhead of profiling SPEC 2006 by factors of 4.1× and 9.7× on eight and 32 cores, respectively. For the memory overhead, we successfully profile SPEC 2006 with the reference input, while previous approaches fail even with the train input. In some cases, we observe more than a 20× improvement in memory consumption and a 16× speedup in profiling time when 32 cores are used. (An illustrative sketch of the stride-compression idea appears after this list.)
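
A note on the first abstract's calibration step: generating per-component power numbers from McPAT and then adjusting them against measured GPU power can be illustrated with a small C++ sketch. The component names, the single least-squares scaling factor, and the sample numbers below are assumptions made purely for illustration; they are not the paper's actual model or data.

#include <cstddef>
#include <cstdio>
#include <map>
#include <string>
#include <vector>

// Unscaled per-component power (watts) produced by the power tool for one
// run; the component names here are hypothetical.
using ComponentPower = std::map<std::string, double>;

// Fit one scaling factor that minimizes the squared error between modeled
// total power and measured power across runs (least squares, no intercept).
double FitScale(const std::vector<ComponentPower>& modeled,
                const std::vector<double>& measured) {
    double num = 0.0, den = 0.0;
    for (std::size_t i = 0; i < modeled.size(); ++i) {
        double total = 0.0;
        for (const auto& kv : modeled[i]) total += kv.second;
        num += total * measured[i];
        den += total * total;
    }
    return den > 0.0 ? num / den : 1.0;
}

int main() {
    // Hypothetical modeled outputs and measured GPU power (watts) for two
    // microbenchmark runs.
    std::vector<ComponentPower> modeled = {
        {{"alu", 40.0}, {"register_file", 15.0}, {"l2_cache", 10.0}, {"dram", 35.0}},
        {{"alu", 25.0}, {"register_file", 10.0}, {"l2_cache", 20.0}, {"dram", 55.0}},
    };
    std::vector<double> measured = {110.0, 120.0};

    double scale = FitScale(modeled, measured);
    for (std::size_t i = 0; i < modeled.size(); ++i) {
        double total = 0.0;
        for (const auto& kv : modeled[i]) total += kv.second;
        double predicted = scale * total;
        double error = 100.0 * (predicted - measured[i]) / measured[i];
        std::printf("run %zu: predicted %.1f W, measured %.1f W, error %+.1f%%\n",
                    i, predicted, measured[i], error);
    }
    return 0;
}

In the paper the adjustment is made against empirical measurements of the Fermi GPU; the single global scale above is only the simplest variant of that idea.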
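
The second abstract's key idea, compressing strided memory accesses and computing data dependences directly on the compressed records, can likewise be sketched. The record layout, the extension rule, and the gcd-based overlap test below are illustrative assumptions, not the SD³ implementation.

#include <cstdint>
#include <cstdio>
#include <numeric>

// One stride-compressed run of memory accesses:
// base, base + stride, base + 2*stride, ..., base + (count-1)*stride.
struct StrideRecord {
    std::uint64_t base = 0;
    std::uint64_t stride = 0;
    std::uint64_t count = 0;
};

// Fold a new address into the record; returns false if it does not continue
// the stride pattern. Assumes addresses arrive in increasing order.
bool Extend(StrideRecord& r, std::uint64_t addr) {
    if (r.count == 0) { r.base = addr; r.count = 1; return true; }
    if (r.count == 1) { r.stride = addr - r.base; r.count = 2; return true; }
    if (addr == r.base + r.count * r.stride) { ++r.count; return true; }
    return false;
}

// Conservative check for whether two compressed runs can share an address
// (a potential data dependence): the address ranges must intersect and the
// distance between the bases must be a multiple of gcd(stride_a, stride_b).
bool MayOverlap(const StrideRecord& a, const StrideRecord& b) {
    if (a.count == 0 || b.count == 0) return false;
    std::uint64_t a_end = a.base + (a.count - 1) * a.stride;
    std::uint64_t b_end = b.base + (b.count - 1) * b.stride;
    if (a_end < b.base || b_end < a.base) return false;  // disjoint ranges
    std::uint64_t g = std::gcd(a.stride ? a.stride : 1, b.stride ? b.stride : 1);
    std::uint64_t gap = a.base > b.base ? a.base - b.base : b.base - a.base;
    return gap % g == 0;
}

int main() {
    StrideRecord loads, stores;
    for (std::uint64_t i = 0; i < 1000; ++i) Extend(loads, 0x1000 + 8 * i);   // e.g. reads of a[i]
    for (std::uint64_t i = 0; i < 1000; ++i) Extend(stores, 0x1020 + 8 * i);  // e.g. writes to a[i + 4]
    std::printf("2000 accesses stored as 2 records; potential dependence: %s\n",
                MayOverlap(loads, stores) ? "yes" : "no");
    return 0;
}

The point of the compression is that a loop touching millions of addresses with a constant stride collapses into a handful of records, so both the profiler's memory footprint and the number of pairwise dependence checks shrink accordingly.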