Title:
Improving the Classification of Software Behaviors using Ensembles

dc.contributor.author Bowring, James Frederick
dc.contributor.author Harrold, Mary Jean
dc.contributor.author Rehg, James M.
dc.date.accessioned 2005-05-02T18:28:16Z
dc.date.available 2005-05-02T18:28:16Z
dc.date.issued 2005
dc.description.abstract One approach to the automatic classification of program behaviors is to view these behaviors as the collection of all the program's executions. Many features of these executions, such as branch profiles, can be measured, and if these features accurately predict behavior, we can build automatic behavior classifiers from them using statistical machine-learning techniques. Two key problems in the development of useful classifiers are (1) the costs of collecting and modeling data and (2) the adaptation of classifiers to new or unknown behaviors. We address the first problem by concentrating on the properties and costs of individual features and the second problem by using the active-learning paradigm. In this paper, we present our technique for modeling a data-flow feature as a stochastic process exhibiting the Markov property. We introduce the novel concept of databins to summarize, as Markov models, the transitions of values for selected variables. We show by empirical studies that databin-based classifiers are effective. We also describe ensembles of classifiers and how they can leverage their components to improve classification rates. We show by empirical studies that ensembles of control-flow and data-flow based classifiers can be more effective than either component classifier. en
dc.format.extent 319814 bytes
dc.format.mimetype application/pdf
dc.identifier.uri http://hdl.handle.net/1853/6010
dc.language.iso en_US
dc.publisher Georgia Institute of Technology en
dc.relation.ispartofseries CERCS;GIT-CERCS-05-10
dc.subject Active learning paradigms en
dc.subject Artificial intelligence en
dc.subject Automatic classification of program behaviors en
dc.subject Data bins en
dc.subject Experimentation en
dc.subject Machine learning en
dc.subject Markov models en
dc.subject Mathematics of computing en
dc.subject Measurement en
dc.subject Modeling en
dc.subject Probability and statistics en
dc.subject Program executions en
dc.subject Reliability en
dc.subject Software behavior en
dc.subject Software engineering en
dc.subject Software testing en
dc.subject Software/program verification en
dc.subject Stochastic processes en
dc.subject Verification en
dc.title Improving the Classification of Software Behaviors using Ensembles en
dc.type Text
dc.type.genre Technical Report
dspace.entity.type Publication
local.contributor.author Harrold, Mary Jean
local.contributor.author Rehg, James M.
local.contributor.corporatename Center for Experimental Research in Computer Systems
local.relation.ispartofseries CERCS Technical Report Series
relation.isAuthorOfPublication a81ec5a9-452c-4407-a97d-77364fcc8af2
relation.isAuthorOfPublication af5b46ec-ffe2-4ce4-8722-1373c9b74a37
relation.isOrgUnitOfPublication 1dd858c0-be27-47fd-873d-208407cf0794
relation.isSeriesOfPublication bc21f6b3-4b86-4b92-8b66-d65d59e12c54
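Illustrative sketch (not taken from the report): the abstract describes summarizing the transitions of values for selected variables as Markov models, called databins, and combining control-flow and data-flow classifiers into ensembles. The Python below is one minimal reading of that idea under stated assumptions. A databin is treated as a fixed value range, each behavior label gets a first-order transition matrix with add-one smoothing, a trace is assigned the label with the highest log-likelihood, and an ensemble sums the component log-likelihoods. The names to_bin, train_markov, classify, and ensemble_classify, the bin edges, and the summing rule are all hypothetical; the report's actual definitions may differ.

```python
import math


def to_bin(value, edges):
    """Map a numeric value to a bin index (assumed databin = value range)."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)


def train_markov(traces, n_states, alpha=1.0):
    """Estimate a first-order transition matrix with add-alpha smoothing."""
    counts = [[alpha] * n_states for _ in range(n_states)]
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[a][b] += 1
    return [[c / sum(row) for c in row] for row in counts]


def log_likelihood(trace, model):
    """Log-probability of the observed transitions under one model."""
    return sum(math.log(model[a][b]) for a, b in zip(trace, trace[1:]))


def classify(trace, models):
    """Pick the label whose Markov model best explains the trace."""
    return max(sorted(models), key=lambda lbl: log_likelihood(trace, models[lbl]))


def ensemble_classify(traces_by_feature, models_by_feature):
    """Assumed combination rule: sum log-likelihoods across features."""
    labels = set.intersection(*(set(m) for m in models_by_feature.values()))

    def score(label):
        return sum(
            log_likelihood(traces_by_feature[f], models_by_feature[f][label])
            for f in models_by_feature
        )

    return max(sorted(labels), key=score)


# Hypothetical training data: values of one variable sampled during executions.
EDGES = [0, 10]  # three bins: v < 0, 0 <= v < 10, v >= 10
passing = [[to_bin(v, EDGES) for v in run] for run in ([1, 2, 3, 4], [5, 6, 7])]
failing = [[to_bin(v, EDGES) for v in run] for run in ([-5, 20, -5, 20],)]
data_flow_models = {
    "pass": train_markov(passing, 3),
    "fail": train_markov(failing, 3),
}
```

With these toy models, classify([1, 1, 1], data_flow_models) scores a trace that stays in the middle bin, and ensemble_classify accepts one trace and one model set per feature, for example a branch-profile feature alongside the databin feature.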
Files
Original bundle
Name: git-cercs-05-10.pdf
Size: 312.32 KB
Format: Adobe Portable Document Format