SLE 2023: Proceedings of the 16th ACM SIGPLAN International Conference on Software Language Engineering

Full Citation in the ACM Digital Library

SESSION: Papers

Exceptions all Over the Shop: Modular, Customizable, Language-Independent Exception Handling Layer

Walter Cazzola
Luca Favalli

The introduction of better abstractions is at the forefront of research and practice. Among many approaches, domain-specific languages are subject to an increase in popularity due to the need for easier, faster and more reliable application development that involves programmers and domain experts alike. To smooth the adoption of such a language-driven development process, researchers must create new engineering techniques for the development of programming languages and their ecosystems. Traditionally, programming languages are implemented from scratch and in a monolithic way. Conversely, modular and reusable language development solutions would improve maintainability, reusability and extensibility. Many programming languages share similarities that can be leveraged to reuse the same language feature implementations across several programming languages; recent language workbenches strive to achieve this goal by solving the language composition and language extension problems. Yet, some features are inherently complex and affect the behavior of several language features. Most notably, the exception handling mechanism involves varied aspects, such as the memory layout, variables, their scope, up to the execution of each statement that may cause an exceptional event—e.g., a division by zero. In this paper, we propose an approach to untangle the exception handling mechanism dubbed the exception handling layer: its components are modular and fully independent from one another, as well as from other language features. The exception handling layer is language-independent, customizable with regards to the memory layout and supports unconventional exception handling language features. To avoid any assumptions with regards to the host language, the exception handling layer is a stand-alone framework, decoupled from the exception handling mechanism offered by the back-end. Then, we present a full-fledged, generic Java implementation of the exception handling layer. The applicability of this approach is presented through a language evolution scenario based on a Neverlang implementation of JavaScript and LogLang, that we extend with conventional and unconventional exception handling language features using the exception handling layer, with limited impact on their original implementation.

An Executable Semantics for Faster Development of Optimizing Python Compilers

Olivier Melançon
Marc Feeley
Manuel Serrano

Python is a popular programming language whose performance is known to be uncompetitive in comparison to static languages such as C. Although significant efforts have already accelerated implementations of the language, more efficient ones are still required. The development of such optimized implementations is nevertheless hampered by its complex semantics and the lack of an official formal semantics. We address this issue by presenting an approach to define an executable semantics targeting the development of optimizing compilers. This executable semantics is written in a format that highlights type checks, primitive values boxing and unboxing, and function calls, which are all known sources of overhead. We also present semPy, a partial evaluator of our executable semantics that can be used to remove redundant operations when evaluating arithmetic operators. Finally, we present Zipi, a Python optimizing compiler prototype developed with the aid of semPy. On some tasks, Zipi displays performance competitive with that of state-of-the-art Python implementations.

Adaptive Structural Operational Semantics

Gwendal Jouneaux
Damian Frölich
Olivier Barais
Benoit Combemale
Gurvan Le Guernic
Gunter Mussbacher
L. Thomas van Binsbergen

Software systems evolve more and more in complex and changing environments, often requiring runtime adaptation to best deliver their services. When self-adaptation is the main concern of the system, a manual implementation of the underlying feedback loop and trade-off analysis may be desirable. However, the required expertise and substantial development effort make such implementations prohibitively difficult when it is only a secondary concern for the given domain. In this paper, we present ASOS, a metalanguage abstracting the runtime adaptation concern of a given domain in the behavioral semantics of a domain-specific language (DSL), freeing the language user from implementing it from scratch for each system in the domain. We demonstrate our approach on RobLANG, a procedural DSL for robotics, where we abstract a recurrent energy-saving behavior depending on the context. We provide formal semantics for ASOS and pave the way for checking properties such as determinism, completeness, and termination of the resulting self-adaptable language. We provide first results on the performance of our approach compared to a manual implementation of this self-adaptable behavior. We demonstrate, for RobLANG, that our approach provides suitable abstractions for specifying sound adaptive operational semantics while being more efficient.

A Reference GLL Implementation

Adrian Johnstone

The Generalised-LL (GLL) context-free parsing algorithm was introduced at the 2009 LDTA workshop, and since then a series of variant algorithms and implementations have been described. There is a wide variety of optimisations that may be applied to GLL, some of which were already present in the originally published form.

This paper presents a reference GLL implementation shorn of all optimisations as a common baseline for the real-world comparison of performance across GLL variants. This baseline version has particular value for non-specialists, since its simple form may be straightforwardly encoded in the implementer's preferred programming language.

We also describe our approach to low level memory management of GLL internal data structures. Our evaluation on large inputs shows a factor 3--4 speedup over a naïve implementation using the standard Java APIs and a factor 4--5 reduction in heap requirements. We conclude with notes on some algorithm-level optimisations that may be applied independently of the internal data representation.

Sharing Trees and Contextual Information: Re-imagining Forwarding in Attribute Grammars

Lucas Kramer
Eric Van Wyk

It is not uncommon to design a programming language as a core language with additional features that define some semantic analyses, but delegate others to their translation to the core. Many analyses require contextual information, such as a typing environment. When this is the same for a term under a new feature and under that feature's core translation, then the term (and computations over it) can be shared, with context provided by the translation. This avoids redundant, and sometimes exponential computations. This paper brings sharing of terms and specification of context to forwarding, a language extensibility mechanism in attribute grammars. Here context is defined by equations for inherited attributes that provide (the same) values to shared trees. Applying these techniques to the ableC extensible C compiler replaced around 80% of the cases in which tree sharing was achieved by a crude mechanism that prevented sharing context specifications and limited language extensibility. It also replaced all cases in which this mechanism was used to avoid exponential computations and allowed the removal of many, now unneeded, inherited attribute equations.

Nanopass Attribute Grammars

Nathan Ringo
Lucas Kramer
Eric Van Wyk

Compilers for feature-rich languages are complex; they perform many analyses and optimizations, and often lower complex language constructs into simpler ones. The nanopass compiler architecture manages this complexity by specifying the compiler as a sequence of many small transformations, over slightly different, but clearly defined, versions of the language that each perform a single straightforward action. This avoids errors that arise from attempting to solve multiple problems at once and allows for testing at each step.

Attribute grammars are ill-suited for this architecture, primarily because they cannot identify the many versions of the language in a non-repetitive and type-safe way. We present a formulation of attribute grammars that addresses these concerns, called nanopass attribute grammars, that (i) identifies a collection of all language constructs and analyses (attributes), (ii) concisely constructs specific (sub) languages from this set and transformations between them, and (iii) specifies compositions of transformations to form nanopass compilers. The collection of all features can be statically typed and individual languages can be checked for well-definedness and circularity. We evaluate the approach by implementing a significant subset of the Go programming language in a prototype nanopass attribute grammar system.

Automated Extraction of Grammar Optimization Rule Configurations for Metamodel-Grammar Co-evolution

Weixing Zhang
Regina Hebig
Daniel Strüber
Jan-Philipp Steghöfer

When a language evolves, meta-models and associated gram- mars need to be co-evolved to stay mutually consistent. Previous work has supported the automated migration of a grammar after changes of the meta-model to retain manual optimizations of the grammar, related to syntax aspects such as keywords, brackets, and component order. Yet, doing so required the manual specification of optimization rule con- figurations, which was laborious and error-prone. In this work, to significantly reduce the manual effort during meta-model and grammar co-evolution, we present an automated approach for extracting optimization rule configurations. The inferred configurations can be used to automatically replay optimizations on later versions of the grammar, thus leading to a fully automated migration process for the supported types of changes. We evaluated our approach on six real cases. Full automation was possible for three of them, with agreement rates between ground truth and inferred grammar between 88% and 67% for the remaining ones.

Reuse and Automated Integration of Recommenders for Modelling Languages

Lissette Almonte
Antonio Garmendia
Esther Guerra
Juan de Lara

Many recommenders for modelling tasks have recently appeared. They use a variety of recommendation methods,tailored to concrete modelling languages. Typically, recommenders are created as independent programs, and subsequently need to be integrated within a modelling tool, incurring in high development effort. Moreover, it is currently not possible to reuse a recommender created for a modelling language with a different notation, even if they are similar. To attack these problems, we propose a methodology to reuse and integrate recommenders into modelling tools. It considers four orthogonal dimensions: the target modelling language, the tool, the recommendation source, and the recommended items. To make homogeneous the access to arbitrary recommenders, we propose a reference recommendation service that enables indexing recommenders, investigating their properties, and obtaining recommendations likely coming from several sources. Our methodology is supported by IronMan, an Eclipse plugin that automates the integration of recommenders within Sirius and tree-based editors, and can bridge recommenders created for a modelling language for their reuse with a different one. We evaluate the power of the tool by reusing 2 recommenders for 4 different languages, and integrating them into 6 modelling tools.

GPT-3-Powered Type Error Debugging: Investigating the Use of Large Language Models for Code Repair

Francisco Ribeiro
José Nuno Castro de Macedo
Kanae Tsushima
Rui Abreu
João Saraiva

Type systems are responsible for assigning types to terms in programs. That way, they enforce the actions that can be taken and can, consequently, detect type errors during compilation. However, while they are able to flag the existence of an error, they often fail to pinpoint its cause or provide a helpful error message. Thus, without adequate support, debugging this kind of errors can take a considerable amount of effort. Recently, neural network models have been developed that are able to understand programming languages and perform several downstream tasks. We argue that type error debugging can be enhanced by taking advantage of this deeper understanding of the language’s structure. In this paper, we present a technique that leverages GPT-3’s capabilities to automatically fix type errors in OCaml programs. We perform multiple source code analysis tasks to produce useful prompts that are then provided to GPT-3 to generate potential patches. Our publicly available tool, Mentat, supports multiple modes and was validated on an existing public dataset with thousands of OCaml programs. We automatically validate successful repairs by using Quickcheck to verify which generated patches produce the same output as the user-intended fixed version, achieving a 39% repair rate. In a comparative study, Mentat outperformed two other techniques in automatically fixing ill-typed OCaml programs.

Temporal Breakpoints for Multiverse Debugging

Matthias Pasquier
Ciprian Teodorov
Frédéric Jouault
Matthias Brun
Luka Le Roux
Loïc Lagadec

Multiverse debugging extends classical and omniscient debugging to allow the exhaustive exploration of non-deterministic and concurrent systems during debug sessions. The introduction of user-defined reductions significantly improves the scalability of the approach. However, the literature fails to recognize the importance of using more expressive logics, besides local-state predicates, to express breakpoints. In this article, we address this problem by introducing temporal breakpoints for multiverse debugging. Temporal breakpoints greatly enhance the expressivity of conditional breakpoints, allowing users to reason about the past and future of computations in the multiverse. Moreover, we show that it is relatively straightforward to extend a language-agnostic multiverse debugger semantics with temporal breakpoints, while preserving its generality. To show the elegance and practicability of our approach, we have implemented a multiverse debugger for the AnimUML modeling environment that supports 3 different temporal breakpoint formalisms: regular-expressions, statecharts, and statechart-based Büchi automata.

Cross-Level Debugging for Static Analysers

Mats Van Molle
Bram Vandenbogaerde
Coen De Roover

Static analyses provide the foundation for several tools that help developers find problems before executing the program under analysis. Common applications include warning about unused code, deprecated API calls, or about potential security vulnerabilities within an IDE. A static analysis distinguishes itself from a dynamic analysis in that it is supposed to terminate even if the program under analysis does not. In many cases it is also desired for the analysis to be sound, meaning that its answers account for all possible program behavior. Unfortunately, analysis developers may make mistakes that violate these properties resulting in hard-to-find bugs in the analysis code itself. Finding these bugs can be a difficult task, especially since analysis developers have to reason about two separate code-bases: the analyzed code and the analysis implementation. The former is usually where the bug manifests itself, while the latter contains the faulty implementation. A recent survey has found that analysis developers prefer to reason about the analyzed program, indicating that debugging would be easier if debugging features such as (conditional) breakpoints and stepping were also available in the analyzed program. In this paper, we therefore propose cross-level debugging for static analysis. This novel technique moves debugging features such as stepping and breakpoints to the base-layer (i.e., analyzed program), while still making interactions with the meta-layer (i.e., analysis implementation) possible. To this end, we introduce novel conditional breakpoints that express conditions, which we call meta-predicates, about the current analysis’ state. We integrated this debugging technique in a framework for implementing modular abstract interpretation-based static analyses called MAF. Through a detailed case study on 4 real-world bugs taken from the repository of MAF, we demonstrate how cross-level debugging helps analysis developers in locating and solving bugs.

Cascade: A Meta-language for Change, Cause and Effect

Riemer van Rozen

Live programming brings code to life with immediate and continuous feedback. To enjoy its benefits, programmers need powerful languages and live programming environments for understanding the effects of code modifications on running programs. Unfortunately, the enabling technology that powers these languages, is missing. Change, a crucial enabler for explorative coding, omniscient debugging and version control, is a potential solution. We aim to deliver generic solutions for creating these languages, in particular Domain-Specific Languages (DSLs). We present Cascade, a meta-language for expressing DSLs with interface- and feedback-mechanisms that drive live programming. We demonstrate run-time migrations, ripple effects and live desugaring of three existing DSLs. Our results show that an explicit representation of change is instrumental for how these languages are built, and that cause-and-effect relationships are vital for delivering precise feedback.

Seamless Code Generator Synchronization in the Composition of Heterogeneous Modeling Languages

Nico Jansen
Bernhard Rumpe

In Software Language Engineering, the composition of heterogeneous languages has become an increasingly relevant research area in recent years. Despite considerable advances in different composition techniques, they mainly focus on composing concrete and abstract syntax, while a thorough yet general concept for synchronizing code generators and their produced artifacts is still missing. Current solutions are either highly generic, typically increasing the complexity beyond their actual value, or strictly limited to specific applications. In this paper, we present a concept for lightweight generator composition, using the symbol tables of heterogeneous modeling languages to exchange generator-specific accessor and mutator information. The information is attached to the symbols of model elements via templates allowing code generators to communicate access routines at the code level without a further contract. Providing suitable synchronization techniques for code generation is essential to enable language composition in all aspects.

Enabling Blended Modelling of Timing and Variability in EAST-ADL

Muhammad Waseem Anwar
Federico Ciccozzi
Alessio Bucaioni

EAST-ADL is a domain-specific modelling language for the design and analysis of vehicular embedded systems. Seamless modelling through multiple concrete syntaxes for the same language, known as blended modelling, offers enhanced modelling flexibility to boost collaboration, lower modelling time, and maximise the productivity of multiple diverse stakeholders involved in the development of complex systems, such as those in the automotive domain. Together with our industrial partner, which is one of the leading contributors to the definition of EAST-ADL and one of its main end-users, we provided prototypical blended modelling features for EAST-ADL. In this article, we report on our language engineering work towards the provision of blended modelling for EAST-ADL to support seamless graphical and textual notations. Notably, for selected portions of the EAST-ADL language (i.e., timing and variability packages), we introduce ad-hoc textual concrete syntaxes to represent the language's abstract syntax in alternative textual notations, preserving the language's semantics. Furthermore, we propose a full-fledged runtime synchronisation mechanism, based on the standard EAXML schema format, to achieve seamless change propagation across the two notations. As EAXML serves as a central synchronisation point, the proposed blended modelling approach is workable with most existing EAST-ADL tools. The feasibility of the proposed approach is demonstrated through a car wiper use case from our industrial partner - Volvo. Results indicate that the proposed blended modelling approach is effective and can be applied to other EAST-ADL packages and supporting tools.

Towards Efficient Model Comparison using Automated Program Rewriting

Qurat ul ain Ali
Dimitris Kolovos
Konstantinos Barmpis

Model comparison is a prerequisite task for several other model management tasks such as model merging, model differencing etc. We present a novel approach to efficiently compare models using programs written in a rule-based model comparison language. As the comparison is done at the model element level, and each element needs to be traversed and compared with its corresponding elements, the execution of these comparison algorithms can be computationally expensive for larger models. In this paper, we present an efficient comparison approach which provides an automated rewriting facility to compare (both homogeneous and heterogeneous) models, based on static program analysis. Using this analysis, we reduce the search space by pre-filtering/indexing model elements, before actually comparing them. Moreover, we reorder the comparison match rules according to the dependencies between these rules to reduce the cost of jumping between rules. Our experiments demonstrate that the proposed model comparison approach delivers significant performance benefits in terms of execution time compared to the default ECL execution engine.

Deriving Integrated Multi-Viewpoint Modeling Languages from Heterogeneous Modeling Languages: An Experience Report

Malte Heithoff
Nico Jansen
Jörg Christian Kirchhof
Judith Michael
Florian Rademacher
Bernhard Rumpe

In modern systems engineering, domain experts increasingly utilize models to define domain-specific viewpoints in a highly interdisciplinary context. Despite considerable advances in developing model composition techniques, their integration in a largely heterogeneous language landscape still poses a challenge. Until now, composition in practice mainly focuses on developing foundational language components or applying language composition in smaller scenarios, while the application to extensive, heterogeneous languages is still missing. In this paper, we report on our experiences of composing sophisticated modeling languages using different techniques simultaneously in the context of heterogeneous application areas such as assistive systems and cyber-physical systems in the Internet of Things. We apply state-of-the-art practices, show their realization, and discuss which techniques are suitable for particular modeling scenarios. Pushing model composition to the next level by integrating complex, heterogeneous languages is essential for establishing modeling languages for highly interdisciplinary development teams.

A Low-Code Platform for Systematic Component-Oriented Language Composition

Jérôme Pfeiffer
Andreas Wortmann

Low-code platforms have gained popularity for accelerating complex software engineering tasks through visual interfaces and pre-built components. Software language engineering, specifically language composition, is such a complex task requiring expertise in composition mechanisms and language workbenches multi-dimensional language constituents (syntax and semantics). This paper presents an extensible low-code platform with a graphical web-based interface for language composition. It enables composition using language components, facilitating systematic composition within language families promoting reuse and streamlining the management, composition, and derivation of domain-specific languages.

A Tool for the Definition and Deployment of Platform-Independent Bots on Open Source Projects

Adem Ait
Javier Luis Cánovas Izquierdo
Jordi Cabot

The development of Open Source Software (OSS) projects is a collaborative process that heavily relies on active contributions by passionate developers. Creating, retaining and nurturing an active community of developers is a challenging task; and finding the appropriate expertise to drive the development process is not always easy. To alleviate this situation, many OSS projects try to use bots to automate some development tasks, thus helping community developers to cope with the daily workload of their projects. However, the techniques and support for developing bots is specific to the code hosting platform where the project is being developed (e.g., GitHub or GitLab). Furthermore, there is no support for orchestrating bots deployed in different platforms nor for building bots that go beyond pure development activities. In this paper, we propose a tool to define and deploy bots for OSS projects, which besides automation tasks they offer a more social facet, improving community interactions. The tool includes a Domain-Specific Language (DSL) which allows defining bots that can be deployed on top of several platforms and that can be triggered by different events (e.g., creation of a new issue or a pull request). We describe the design and the implementation of the tool, and illustrate its use with examples.

Online Name-Based Navigation for Software Meta-languages

Peter D. Mosses

Software language design and implementation often involve specifications written in various esoteric meta-languages. Language workbenches generally include support for precise name-based navigation when browsing language specifications locally, but such support is lacking when browsing the same specifications online in code repositories.

This paper presents a technique to support precise name-based navigation of language specifications in online repositories using ordinary web browsers. The idea is to generate hyperlinked twins: websites where verbatim copies of specification text are enhanced with hyperlinks between name references and declarations. By generating hyperlinks directly from the name binding analysis used internally in a language workbench, online navigation in hyperlinked twins is automatically consistent with local navigation.

The presented technique has been implemented for the Spoofax language workbench, and used to generate hyperlinked twin websites from various language specifications in Spoofax meta-languages. However, the applicability of the technique is not limited to Spoofax, and developers of other language workbenches could presumably implement similar tooling, to make their language specifications more accessible to those who do not have the workbench installed.

Practical Runtime Instrumentation of Software Languages: The Case of SciHook

Dorian Leroy
Benoit Combemale
Benoît Lelandais
Marie-Pierre Oudot

Software languages have pros and cons, and are usually chosen accordingly. In this context, it is common to involve different languages in the development of complex systems, each one specifically tailored for a given concern. However, these languages create de facto silos, and offer little support for interoperability with other languages, be it statically or at runtime. In this paper, we report on our experiment on extracting a relevant behavioral interface from an existing language, and using it to enable interoperability at runtime. In particular, we present a systematic approach to define the behavioral interface and we discuss the expertise required to define it. We illustrate our work on the case study of SciHook, a C++ library enabling the runtime instrumentation of scientific software in Python. We present how the proposed approach, combined with SciHook, enables interoperability between Python and a domain-specific language dedicated to numerical analysis, namely NabLab, and discuss overhead at runtime.