Generating and Analyzing Synthetic Workloads using Iterative Distillation

Kurmas, Zachary Alan
Ramachandran, Umakishore
The exponential growth in computing capability and use has produced a high demand for large, high-performance storage systems. Unfortunately, advances in storage system research have been limited by (1) a lack of evaluation workloads, and (2) a limited understanding of the interactions between workloads and storage systems. We have developed a tool, the Distiller, that helps address both limitations. Our thesis is as follows: Given a storage system and a workload for that system, one can automatically identify a set of workload characteristics that describes a set of synthetic workloads with the same performance as the workload they model. These representative synthetic workloads increase the number of available workloads with which storage systems can be evaluated. More importantly, the characteristics also identify those workload properties that affect disk array performance, thereby highlighting the interactions between workloads and storage systems. This dissertation presents the design and evaluation of the Distiller. Specifically, our contributions are as follows. (1) We demonstrate that the Distiller finds synthetic workloads with at most 10% error for six of the eight workloads we tested. (2) We find that all of the potential error metrics we use to compare workload performance have limitations. Additionally, although the internal threshold that determines which attributes the Distiller chooses has a small effect on the accuracy of the final synthetic workloads, it has a large effect on the Distiller's running time. Similarly, (3) we find that we can reduce the precision with which we measure attributes while only moderately reducing the resulting synthetic workload's accuracy. Finally, (4) we show how to use the information contained in the chosen attributes to predict the performance effects of modifying the storage system's prefetch length and stripe unit size.
850106 bytes