Onward! '25: Proceedings of the 2025 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software

Full Citation in the ACM Digital Library

SESSION: Papers

Synchronous Programming for Kids: A Manifesto

Jean Pichon-Pharabod

Many primary school age children find developing games and interactive animations very motivating, and this has been successfully leveraged by several educational programming environments. However, most of these environments, like Scratch and Microsoft MakeCode, are fundamentally based on imperative, Pascal-style programming languages with ad-hoc support for concurrency and events. I argue that this style of programming language, and the burden of encoding state machines that it imposes, does not match the children's intuitions, and is therefore discouraging. Instead, I propose re-founding these tools around synchronous programming languages like Esterel, which, by being centered around the concepts of time, sequential composition, concurrency, signalling, and preemption, match the children's intuitions. I describe a prototype implementation, and argue how using such a programming language would enable schoolchildren to express their intuitions more directly.

Semantics-Preserving Transformation of Context-Free Grammars into LL(1) Form

Vladyslav Bilyk

Transformation of context-free grammars into LL(1) form enables construction of simple and efficient parsers. However, if semantics are overlooked, the transformation is likely to result in a complex grammar that needs to be manually aligned with the original semantics. This work presents a method of semantics-preserving grammar transformation into LL(1) form using a representation based on trees. Grammar rules and their semantics are modelled as typed tree nodes, which enables elimination of LL(1) conflicts while retaining the original semantics. Transformed grammars can be used to construct continuation-passing style parsers that produce parse trees matching the original grammar. It is shown how this approach can be applied to the design of domain-specific languages in Java, resulting in size and type complexity linear in the size of the transformed grammar.

ScooPy: Enhancing Program Synthesis with Nested Example Specifications

Tomer Katz
Hila Peleg

Current IDE-integrated program synthesis leaves no indication of what code was auto-generated, let alone an explanation of why. This makes both identifying and understanding machine-generated code hard. We therefore add example scopes, comments enclosing synthesized code that document the input-output examples that created it. This also allows programmers to manually edit examples and re-launch the synthesizer without tediously re-entering the examples. Scopes are simply text, and so can be created anywhere, including inside other scopes. However, synthesizers can only reason about one flat example set. To address this, we introduce ScooPy, IDE-integrated program synthesis for nested example specifications. ScooPy lets programmers edit example scopes, see live information based on the examples, and call the synthesizer on nested scopes. In two user studies with 6 and 16 participants we see that example scopes increase users’ engagement with the code and that ScooPy improves users’ ability to synthesize for some types of tasks.

Foundational Design Principles and Patterns for Building Robust and Adaptive GenAI-Native Systems

Frederik Vandeputte

Generative AI (GenAI) has emerged as a transformative technology, demonstrating remarkable capabilities across diverse application domains. However, GenAI faces several major challenges in developing reliable and efficient GenAI-empowered systems due to its unpredictability and inefficiency. This paper advocates for a paradigm shift: future GenAI-native systems should integrate GenAI's cognitive capabilities with traditional software engineering principles to create robust, adaptive, and efficient systems.

We introduce foundational GenAI-native design principles centered around five key pillars---reliability, excellence, evolvability, self-reliance, and assurance---and propose architectural patterns such as GenAI-native cells, organic substrates, and programmable routers to guide the creation of resilient and self-evolving systems. Additionally, we outline the key ingredients of a GenAI-native software stack and discuss the impact of these systems from technical, user adoption, economic, and legal perspectives, underscoring the need for further validation and experimentation. Our work aims to inspire future research and encourage relevant communities to implement and refine this conceptual framework.

Sharing Is Scaring: Linking Cloud File-Sharing to Programming Language Semantics

Skyler Austen
Shriram Krishnamurthi
Kathi Fisler

Users often struggle with cloud file-sharing applications. Problems appear to arise not only from interface flaws, but also from misunderstanding the underlying semantics of operations like linking, attaching, downloading, and editing. We argue that these difficulties echo long-standing challenges in understanding concepts in programming languages like aliasing, copying, and mutation.

We begin to examine this connection through a formative user study investigating general users’ understanding of file sharing. Our study casts known misconceptions from the programming-education literature into semantically-similar cloud file-sharing tasks. It also uses tasks that echo two kinds of analyses used in programming-education: tracing and programming. Our findings reveal widespread misunderstandings across several tasks.

We also develop a formal semantics of cloud file-sharing operations, reflecting copying, referencing, and mutating shared content. By explicating the semantics, we aim to provide a formal foundation for improving mental models, educational tools, and automated assistance. This semantics can support applications including trace checking, workflow synthesis, and interactive feedback.

An Argument for the Practicality of Entity Component Systems as the Primary Data Structure for an Interpreter or Compiler

Joshua Dahl
Frederick C. Harris

In this paper, we examine how Entity Component Systems, a data structure that has been gaining popularity in game engines, can benefit compiler and interpreter design. ECS does not consistently provide the same performance benefits that games utilize it for; however, it does make writing optimization passes easier. Additionally, its dogmatic focus on simple Plain-Old-Data structures makes serialization much easier. These benefits do come at a memory cost, the severity of which we still need to compare against more mature language implementations.

X-by-Construction: Towards Ensuring Non-functional Properties in by-Construction Engineering

Maximilian Kodetzki
Tabea Bordis
Alex Potanin
Ina Schaefer

Correctness-by-Construction engineering (CbC) is a refinement-based approach to develop functionally correct programs based on a formal specification. By correctly applying refinement rules during development, CbC enables detection of bugs during program construction, unlike post-hoc verification, which proves correctness only after implementation. Support for CbC engineering for non-functional properties is summarized under the term X-by-Construction (XbC). However, current XbC approaches are limited to information flow properties, leaving other non-functional properties of software quality, such as performance or reliability, unsupported. To address this gap, we present our vision for generalizing XbC to integrate non-functional properties into by-Construction engineering. In this way, we leverage the development of high-quality software through a refinement-based approach for future software engineering. With that, it will become possible to develop software ensuring that it not only exhibits functional correctness, but also non-functional guarantees by construction. Further, we propose ideas for ensuring energy efficiency in by-Construction engineering. We assess what it needs to integrate non-functional properties into by-Construction engineering and discuss arising challenges.

On Collective Control over User Interfaces in the Face of Network Effects

Caleb Malchik
Joan Feigenbaum

User interfaces for accessing modern Internet services are increasingly defined by the services themselves, leaving users with little control over the interfaces they use on a daily basis. This architecture and the economic drive to monetize users’ activity has lead to a status quo that falls short in key areas of user concern, with interfaces exhibiting undesirable features such as dark patterns, distracting ads, and pop-ups. The Platform for Content-Structure Inference (PCSI, pronounced “pixie”) supports new applications that resist these features by giving users control over their interfaces and allows new applications to overcome network effects by integrating existing web content. This strategy of client-driven interoperability is demonstrated in the news reader application PCSINews. In concert with user organizations such as data co-ops, PCSI could enable evolutionary advancement toward a better Internet characterized by a fairer balance of power between users and service providers.

Exploring the Design Space for Runtime Enforcement of Dynamic Capabilities

Andrew Fawcet
James Noble
Michael Homer

Dala is an "as simple as possible" concurrent object-oriented language designed to avoid data races. Dala objects come in three safe flavours: immutable, isolated, and local, plus a fourth unsafe flavour. The objects are organised into an hierarchy so that e.g. immutable objects can be accessed from anywhere but never mutated, while thread local objects can be mutated but cannot be accessed outside their containing thread.

Dala's flavours are intended to be enforced at runtime: unfortunately it is not clear when and how best to undertake that enforcement.

In this paper we present six axes of variation for the dynamic enforcement of Dala flavours: when each safe flavour is enforced, how isolated objects are moved, how new objects are assigned a flavour, and whether objects' flavours can change over time. These six axes embody 2,880 different combinations: we present five exemplary designs and discuss how they are placed on those those axes. Programming language designers can use our analysis to inform the design of dynamic capabilities in their languages, while the analysis may also help programmers understand what language designers have done.

Literate Tracing

Matthew Sotoudeh

As computer systems grow ever larger and more complex, a crucial task in software development is for one person (the system expert) to communicate to another (the system novice) how a certain program works. This paper reports on the author’s experiences with a paradigm for program documentation that we call literate tracing. A literate trace explains a software system using annotated, concrete execution traces of the system. Literate traces complement both in-code comments (which often lack global context) and out-of-band design docs (which often lack a concrete connection to the code). We also describe TReX, our tool for making literate traces that are interactive, visual, and guaranteed by construction to be faithful to the program semantics. We have used TReX to write literate traces explaining components of large systems software including the Linux kernel, Git source control system, and GCC compiler.

TideScript: A Domain Specific Language for Peptide Chemistry

Nicholas Morris
Blair Archibald
S. Hessam M. Mehr

Synthetic peptides are becoming increasingly important in industry and academia, including applications in peptide-based therapeutics. While Solid Phase Peptide Synthesis (SPPS) is largely automated, current software options for describing this process are instrument-specific, with no unifying standard protocol specification. Post-synthesis modification of peptides in the solution phase is also performed manually. As a result, discovery and process optimisation remains an iterative and manual process, requiring large amounts of human intervention and labour. Here we introduce TideScript: a domain specific language for peptide chemistry. TideScript aims to provide a programming language to standardize protocol descriptions, allowing unambiguous specification, increased automation, and autonomous optimization. At present, TideScript is a formal specification language, with scope to expand into full execution on hardware. We give a full description of TideScript, including its syntax and a small-step operational semantics, and show how TideScript can encode three common peptide protocols: peptide chain assembly by SPPS, peptide cleavage and thiol-maleimide conjugation. We compare this approach to existing non-peptide-specific chemical protocol languages, χDL and BioScript.

What You See Is What It Does: A Structural Pattern for Legible Software

Eagon Meng
Daniel Jackson

The opportunities offered by LLM coders (and their current limitations) demand a reevaluation of how software is structured. Software today is often “illegible”—lacking a direct correspondence between code and observed behavior—and insufficiently modular, leading to a failure of three key requirements of robust coding: incrementality (the ability to deliver small increments by making localized changes), integrity (avoiding breaking prior increments) and transparency (making clear what has changed at build time, and what actions have happened at runtime).

A new structural pattern offers improved legibility and modularity. Its elements are concepts and synchronizations: fully independent services and event-based rules that mediate between them. A domain-specific language for synchronizations allows behavioral features to be expressed in a granular and declarative way (and thus readily generated by an LLM). A case study of the RealWorld benchmark is used to illustrate and evaluate the approach.

SESSION: Essays

Carving Text at Its Joints: A New Perspective on Writing and Computers

Kevin Graaf

Multilingual computer text handling appears at present unapproachably difficult. Despite considerable progress made over the past three decades, major open problems remain in encoding, interactively editing, programmatically transforming, and displaying text in many of the world's writing systems. These challenges are especially pronounced with regard to ancient and minority or minoritized writing systems and are exacerbated by antiquated, retrofitted software and standards. In this paper, I offer a genealogical critique of how computer scientists have approached human writing hitherto, a new, clarifying perspective on writing as it relates to computers, and a sketch of how the problems I identify herein could be addressed.

Gauguin, Descartes, Bayes: A Diurnal Golem’s Brain

Kartik Chandra
Amanda Liu
Jonathan Ragan-Kelley
Joshua B. Tenenbaum

A "quine" is a deterministic program that prints itself. In this essay, I will show you a "gauguine": a probabilistic program that infers itself. A gauguine is repeatedly asked to guess its own source code. Initially, its chances of guessing correctly are of course minuscule. But as the gauguine observes more and more of its own previous guesses, it detects patterns of behavior and gains information about its inner workings. This information allows it to bootstrap self-knowledge, and ultimately discover its own source code. We will discuss how—and why—we might write a gauguine, and what we stand to learn by constructing one.

Let’s Take Esoteric Programming Languages Seriously

Jeremy Singer
Steve Draper

Esoteric programming languages are challenging to learn, but their unusual features and constraints may serve to improve programming ability. From languages designed to be intentionally obtuse (e.g. INTERCAL) to others targeting artistic expression (e.g. Piet) or exploring the nature of computation (e.g. Fractan), there is rich variety in the realm of esoteric programming languages. This essay examines the counterintuitive appeal of esoteric languages and seeks to analyse reasons for this popularity. We will explore why people are attracted to esoteric languages in terms of (a) program comprehension and construction, as well as (b) language design and implementation. Our assertion is that esoteric languages can improve general PL awareness, at the same time as enabling the esoteric programmer to impress their peers with obscure knowledge. We will also consider pedagogic principles and the use of AI, in relation to esoteric languages. Emerging from the specific discussion, we identify a general set of 'good' reasons for designing new programming languages. It may not be possible for anyone to be exhaustive on this topic, and it is certain we have not achieved that goal here. However we believe our most important contribution is to draw more attention to the varied and often implicit motivations involved in programming language design.

The Unix Executable as a Smalltalk Method: And Its Implications for Unix-Smalltalk Unification

Joel Jakubovic

Unix and Smalltalk are very different in the details, but bear curious similarities in their broad outlines. Prior work has made these comparisons at a high level and sketched a path for retrofitting Smalltalk's advantages onto Unix (without compromising the advantages of the latter). Everybody seems to agree on identifying the Unix file with the Smalltalk object, but this still leaves much unspecified. I argue that we should identify the Unix executable with the Smalltalk method. A Smalltalk VM implementation via the filesystem falls out quite easily from this premise; however, the severe overhead associated with Unix processes casts doubt on its practical realisation. Nevertheless, we can see several ways around this problem. The connection shows promise for realising the benefits of Smalltalk within Unix without sequestering the former in a hermetically sealed image and VM.

The Proof Must Go On: Formal Methods in the Theater of Secure Software Development of the Future

Charles Averill

Formal Methods (FM) will not arrive with fanfare. It will spread quietly, not as a revolution, but as a patch: reviewed, merged, and dismissed. In the coming century, the decades-old practice of simply testing code will collapse under the weight of cyberattacks and development complexity.

It is succeeded by a slower, more methodical approach as FM becomes infrastructure. Verified C libraries thread their way into the systems that run the internet. Model checkers embed themselves in CI pipelines. Modern type systems reach the masses. AI and IO remain stubbornly unverifiable and insecure, forcing industry to work towards a deeper form of trust. Meanwhile, a generation of developers learns to write specifications not as a niche academic exercise, but as a matter of professional survival.

This paper tells a fictional, yet historically-grounded, story of how formal verification goes from lofty fantasy to invisible standard throughout the century, and how, without anyone noticing, the proof goes on.