Learning from Program Context to Predict Effective Program Transformations
July 5, 2017 at 3:30pm, CSE 305
Program transformations, or program mutations, are an essential component of many automated software engineering techniques, such as automatic program repair, automatic test generation, and software diversification. The search space of possible program variants is huge, and whether a program transformation is effective, that is, whether it generates a useful program variant, depends on the application. For example, automatic program repair aims to generate non-equivalent program variants that improve functional correctness, while mutation-based test generation aims to generate test cases that reveal non-equivalent yet hard-to-detect program variants. Conversely, software diversification aims to generate semantically equivalent program variants that, e.g., reduce vulnerability while preserving functional behavior. In this talk, I will show how program context, extracted from a program’s abstract syntax tree (AST), can predict effective program transformations. Specifically, I will show how syntactic and semantic features of the AST can be used to train a machine learning classifier that predicts whether a program transformation yields an equivalent or hard-to-detect program variant, both within and between projects. I will conclude with a discussion of how the prediction of effective program transformations can be used in the context of vulnerability detection and software diversification.
René Just is an Assistant Professor at the University of Massachusetts, Amherst. His research interests are in software engineering and software security, in particular static and dynamic program analysis, mobile security, mining software repositories, and applied machine learning. His software engineering research has won several ACM SIGSOFT Distinguished Paper Awards, and he develops research infrastructure and tools (e.g., Defects4J and the Major mutation framework) that are widely used by other researchers.