AI-SEPS 2019: Proceedings of the 6th ACM SIGPLAN International Workshop on AI-Inspired and Empirical Methods for Software Engineering on Parallel Computing Systems


SESSION: Papers

“It looks like you’re writing a parallel loop”: a machine learning based parallelization assistant

Despite decades of research into parallelizing compiler technology, software parallelization remains a largely manual task in which the key resource is expert time. In this paper we focus on the time-consuming task of identifying those loops in a program that are both worthwhile and feasible to parallelize. We present a methodology and tool that make better use of expert time by guiding experts' effort directly towards the loops where the largest performance gains can be expected, while keeping analysis and transformation effort to a minimum.

We have developed a novel parallelization assistant that provides programmers with a ranking of all loops in a program based on their overall merit. For each loop, this metric combines its potential contribution to speedup with an estimated probability of successful parallelization. This probability is predicted using a machine learning model that has been trained, validated, and tested on 1415 labelled loops, achieving a prediction accuracy greater than 90%.
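To make the ranking idea concrete, here is a minimal C++ sketch, not the authors' implementation: it assumes a hypothetical LoopInfo record carrying a loop's share of sequential runtime ("coverage") and a classifier-supplied parallelization probability, and models merit simply as their product. The paper's actual metric may combine these quantities differently.

    // Illustrative sketch only; LoopInfo and the merit formula are assumptions.
    #include <algorithm>
    #include <iostream>
    #include <string>
    #include <vector>

    struct LoopInfo {
        std::string id;       // e.g. "cg.c:412" (invented example)
        double coverage;      // fraction of sequential runtime spent in the loop
        double parallel_prob; // predicted probability of successful parallelization
    };

    // Merit: expected payoff of spending expert time on this loop, modelled
    // here as coverage weighted by the predicted success probability.
    double merit(const LoopInfo& l) { return l.coverage * l.parallel_prob; }

    int main() {
        std::vector<LoopInfo> loops = {
            {"cg.c:412", 0.45, 0.92},
            {"cg.c:530", 0.30, 0.15},
            {"cg.c:101", 0.10, 0.88},
        };
        // Present the most promising loops to the expert first.
        std::sort(loops.begin(), loops.end(),
                  [](const LoopInfo& a, const LoopInfo& b) {
                      return merit(a) > merit(b);
                  });
        for (const auto& l : loops)
            std::cout << l.id << "  merit=" << merit(l) << '\n';
    }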

We have evaluated our parallelization assistant on sequential C applications from the SNU NAS benchmark suite. We show that our novel methodology achieves parallel performance levels comparable to those obtained by expert programmers while requiring less expert time. On average, our assistant reduces by 20% the number of lines of code that must be inspected manually before expert-level parallel speedup is reached.

Automatic identification of standard template algorithms in raw loops

This paper explains a tool-based approach to detecting source-code patterns that can be substituted with calls to the C++ standard template library (STL). The goal of the tool is to support developers in refactoring a legacy code base to make use of modern library interfaces and standardized algorithms. In this way, the programmer's intent is encoded more explicitly in the code, increasing readability. In addition, the STL is well tested, so its use can improve robustness. We show early results from applying our tool to the High-Performance Conjugate Gradient (HPCG) benchmark. The current prototype produces roughly 50% false positives, all of which a human can easily identify.
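For illustration only (the function and the code are invented, not taken from the paper or from HPCG), the following C++ fragment shows the kind of substitution such a tool targets: a raw reduction loop and the equivalent call to a standard algorithm.

    // A raw loop and its STL replacement; names are hypothetical.
    #include <numeric>
    #include <vector>

    double dot_raw(const std::vector<double>& x, const std::vector<double>& y) {
        double sum = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i)  // raw loop the tool would flag
            sum += x[i] * y[i];
        return sum;
    }

    double dot_stl(const std::vector<double>& x, const std::vector<double>& y) {
        // Equivalent standard algorithm: the intent (an inner product) is
        // explicit, and the implementation is part of the well-tested STL.
        return std::inner_product(x.begin(), x.end(), y.begin(), 0.0);
    }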