Use of third-party library APIs is pervasive, but can be error-prone. API-usage errors can be detected via static analysis if specifications of correct usage are available, but manually creating such specifications is a bottleneck. We showcase a semi-automated "big code" solution, where we use large code corpora to mine patterns in API usage, and ask human experts to perform analytics on those patterns to create static analysis rules.
The paper presents the extendable C/C++ whole-program call-graph tool MetaCG. We introduce its graph library, the Clang-based tool CGCollector to construct the call graph and attach meta information, and CGValidate to check for missing edges given a particular execution. MetaCG offers extendability through its metadata function-annotation mechanism to transfer information between tools. It preserves inheritance hierarchies and can be serialized into JSON. We evaluate CG-Collector’s ability to construct whole-program call-graphs for C/C++ code and, subsequently, present a performance profiler and a memory sanitizer that rely on MetaCG for whole-program call-graph information
Systems of transformations arise in many programming systems, such as in graphs of implicit type conversion functions. It is important to ensure that these diagrams commute: that composing any path of transformations from the same source to the same destination yields the same result. However, a straightforward approach to verifying commutativity must contend with cycles, and even so it runs in exponential time. Previous work has shown how to verify commutativity in the special case of acyclic diagrams in O(|V|4|E|2) time, but this is a batch algorithm: the entire diagram must be known ahead of time. We present an online algorithm that efficiently verifies that a commutative diagram remains commutative when adding a new edge. The new incremental algorithm runs in O(|V|2(|E| + |V|)) time. For the case when checking the equality of paths is expensive, we also present an optimization that runs in O(|V|4) time but reduces to the minimum possible number of equality checks. We implement the algorithms and compare them to batch baselines, and we demonstrate their practical application in the compiler of a domain-specific language for geometry types. To study the algorithms’ scalability to large diagrams, we apply them to discover discrepancies in currency conversion graphs.
Energy harvesting allows computational devices to run without a battery, opening new application domains of computing. Such devices work under an intermittent computing model, where the system may power cycle several times a second. To ensure progress, intermittent computing uses checkpoints, with much work being dedicated to this direction. However, no existing approaches handle programs using dynamically allocated memory in the intermittent computing model. We pose this as a challenge area, demonstrate the complexities of checkpointing in this space, and propose key characteristics of an effective solution.
Type checkers are the most commonly used form of static analysis, but their design is coupled to the rest of the language, making it hard or impossible to bring new kinds of reasoning to existing, unmodified code. We propose a novel approach to checking advanced type invariants and properties in unmodified source code, while approaching the speed and ease of simple, syntax directed type checkers.
The insight is that by combining a deep program analysis (symbolic execution) with a cheaper program abstraction (based on program slicing), it appears possible to reconstitute type-checking in the context of an underapproximate analysis. When the program's 'type level' can be opportunistically disentangled from the 'value level', this is done by the program abstraction step, in some cases removing the underapproximation.
We implement a simple prototype that demonstrates this idea by checking the safety of generic pointers in C, pointing to benefits such as safe homogeneous and heterogeneous generic data structures.