Despite decades of research into parallelizing compiler technology, software parallelization remains a largely manual task where the key resource is expert time. In this paper we focus on the time-consuming task of identifying the loops in a program that are both worthwhile and feasible to parallelize. We present a methodology and tool that make better use of expert time by guiding the experts' effort directly towards the loops where the largest performance gains can be expected, while keeping analysis and transformation effort to a minimum.
We have developed a novel parallelization assistant that provides programmers with a ranking of all loops in a program based on their overall merit. For each loop, this metric combines the loop's potential contribution to speedup with an estimated probability of its successful parallelization. This probability is predicted by a machine learning model that has been trained, validated, and tested on 1415 labelled loops, achieving a prediction accuracy greater than 90%.
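The ranking idea can be illustrated with a minimal sketch. The abstract does not state how the two components are combined, so the code below assumes the simplest option: the merit of a loop is the product of its share of sequential runtime and its predicted probability of successful parallelization. The `LoopInfo` record, its field names, and the functions `merit` and `rank_loops` are hypothetical and are not taken from the tool itself.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical per-loop record; the field names are illustrative only.
struct LoopInfo {
    std::string id;           // e.g. a source location such as "cg.c:120"
    double runtime_fraction;  // share of sequential runtime spent in the loop, in [0, 1]
    double p_parallelizable;  // predicted probability of successful parallelization, in [0, 1]
};

// Assumed merit: the expected share of runtime that becomes parallel.
// The actual metric used by the assistant may be defined differently.
double merit(const LoopInfo& loop) {
    assert(loop.runtime_fraction >= 0.0 && loop.runtime_fraction <= 1.0);
    assert(loop.p_parallelizable >= 0.0 && loop.p_parallelizable <= 1.0);
    return loop.runtime_fraction * loop.p_parallelizable;
}

// Returns the loops sorted by descending merit; ties keep their input order.
std::vector<LoopInfo> rank_loops(std::vector<LoopInfo> loops) {
    std::stable_sort(loops.begin(), loops.end(),
                     [](const LoopInfo& a, const LoopInfo& b) {
                         return merit(a) > merit(b);
                     });
    return loops;
}
```

Under this assumption, a loop that accounts for 30% of the runtime and is very likely parallelizable is ranked above a loop that accounts for 50% of the runtime but is unlikely to be parallelizable, which matches the intent of directing expert effort to where it is most likely to pay off.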
We have evaluated our parallelization assistant on sequential C applications from the SNU NAS benchmark suite. We show that our novel methodology achieves parallel performance levels comparable to those achieved by expert programmers while requiring less expert time. On average, our assistant reduces the number of lines of code that have to be inspected manually before reaching expert-level parallel speedup by 20%.
This paper presents a tool-based approach to detecting source code patterns that can be replaced with calls to the C++ Standard Template Library (STL). The goal of the tool is to support developers in refactoring a legacy code base to make use of modern library interfaces and standardized algorithms. This way, the intention of the programmer is encoded more explicitly in the code, which increases readability. In addition, the STL is well tested, so its use can improve robustness. We show early results from applying our tool to the High-Performance Conjugate Gradient (HPCG) benchmark. In the current prototype, roughly 50% of the reported matches are false positives, all of which a human can easily identify.
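As an illustration of the kind of substitution described above, consider a hand-written dot product, an operation that is central to conjugate gradient solvers. This is a sketch only: the abstract does not list the patterns the tool detects, and the function names `dot_loop` and `dot_stl` are hypothetical rather than taken from HPCG or from the tool.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hand-written reduction loop: the intent (a dot product) is implicit.
double dot_loop(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}

// Possible STL replacement: std::inner_product names the intent directly
// and accumulates in the same left-to-right order as the loop above.
double dot_stl(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    return std::inner_product(x.begin(), x.end(), y.begin(), 0.0);
}
```

The initial value `0.0` is deliberately a `double`: passing the integer literal `0` would make `std::inner_product` accumulate in `int` and silently truncate the result, which is one reason a human should review each suggested substitution.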