Proceedings of the ACM on Programming Languages: Vol. 10, No. POPL. 2026


SECTION: Regular Papers

The Complexity of Testing Message-Passing Concurrency

A key computational question underpinning the automated testing and verification of concurrent programs is the consistency question: given a partial execution history, can it be completed in a consistent manner? Due to its importance, consistency testing has been studied extensively for memory models, as well as for database isolation levels. A common theme in all these settings is the use of shared memory as the primary mode of inter-thread communication. On the other hand, modern programming languages, such as Go, Rust and Kotlin, advocate a paradigm shift towards channel-based (i.e., message-passing) communication. However, the consistency question for channel-based concurrency is currently poorly understood.
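
As a hedged illustration of the consistency question (our own example, not drawn from the paper), consider two threads communicating over a single FIFO channel c of capacity 1, with a history recording the value observed by each receive:

  t1:  send(c, 1); send(c, 2)
  t2:  recv(c) = 2; recv(c) = 1

Under FIFO channel semantics no interleaving realizes this history, since the first receive must observe the first value sent; swapping the observed values (1, then 2) yields a history that is consistent, witnessed by alternating the two threads.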

In this paper we lift the study of fundamental consistency problems to channels, taking into account various input parameters, such as the number of threads executing, the number of channels, and the channel capacities. We draw a rich complexity landscape, including upper bounds that become polynomial when certain input parameters are fixed, as well as hardness lower bounds. Our upper bounds are based on algorithms that can drive the verification of channel consistency in automated verification tools. Our lower bounds characterize minimal input parameters that are sufficient for hardness to arise, and thus shed light on the intricacies of testing channel-based concurrency. In combination, our upper and lower bounds characterize the boundary of tractability/intractability of verifying channel consistency, and imply that our algorithms are often (nearly) optimal. We have also implemented our main consistency-checking algorithm and designed optimizations to enhance its performance. We evaluated the performance of our implementation over a set of 103 instances obtained from open-source Go projects, and compared it against a constraint-solving-based algorithm. Our experimental results demonstrate the power of our consistency-checking algorithm: it scales to around 1M events and is significantly faster than the constraint-solving approach.

Let Generalization, Polymorphic Recursion, and Variable Minimization in Boolean-Kinded Type Systems

Recent research has demonstrated the effectiveness of extending the Hindley-Milner (HM) type system with Boolean kinds to support type inference for a wide variety of features. However, the means to support classic type system provisions such as local let generalization and polymorphic recursion is either limited or unknown for such extensions. This paper contributes procedures for equational generalization and semiunification in arbitrary Boolean rings, enabling let generalization and polymorphic recursion in Boolean-kinded type inference. Additionally, methods to minimize the number of bound Boolean type variables are developed to keep types small in these systems. Boolean-kinded HM extensions are exemplified with nullable reference types, and how to use the developed procedures to support let generalization and polymorphic recursion is outlined.
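
As a hedged, Haskell-based illustration of why polymorphic recursion resists plain Hindley-Milner inference (ordinary Haskell, not a Boolean-kinded system), consider the classic nested-datatype example, which type-checks only because of its explicit signature:

  data Nest a = Nil | Cons a (Nest (a, a))

  -- The recursive call is at type Nest (a, a) -> Int rather than
  -- Nest a -> Int, so 'size' uses polymorphic recursion; removing
  -- the signature makes Hindley-Milner-style inference fail.
  size :: Nest a -> Int
  size Nil         = 0
  size (Cons _ xs) = 1 + 2 * size xs

Local let generalization raises the analogous question of whether a let-bound definition such as let id = \x -> x in (id 1, id True) may be assigned a polymorphic type at its binding site.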

Normalisation for First-Class Universe Levels

Various mechanisms are available for managing universe levels in proof assistants based on type theory. The Agda proof assistant implements a strong form of universe polymorphism in which universe levels are internalised as a type, making levels first-class objects and permitting higher-rank quantification via ordinary Π-types. We prove normalisation and decidability of equality and type-checking for a type theory with first-class universe levels inspired by Agda. We also show that level primitives can safely be erased in extracted programs. Our development is formalised in Agda itself and builds upon previous work which uses logical relations on extrinsically typed syntax.

Qudit Quantum Programming with Projective Cliffords

This paper introduces a novel abstraction for programming quantum operations, specifically projective Cliffords, as functions over the qudit Pauli group. Generalizing the idea behind Pauli tableaux, we introduce a type system and lambda calculus for projective Cliffords called LambdaPC that captures well-formed Clifford operations via a Curry-Howard correspondence with a particular encoding of the Clifford and Pauli groups. In LambdaPC, users write functions that encode projective Cliffords P ↦ U P U†, and such functions are compiled to circuits executable on modern quantum computers that transform quantum states |ϕ⟩ into U |ϕ⟩, up to a global phase. Importantly, the language captures not just qubit operations, but qudit operations for any dimension d.
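
As a standard single-qubit example of a projective Clifford (independent of the paper's presentation), the Hadamard gate H acts on Pauli operators by conjugation,

  X ↦ H X H† = Z,   Z ↦ H Z H† = X,

and this action on the Pauli group determines H up to a global phase.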

Throughout the paper we explore what it means to program with projective Cliffords through a number of examples and a case study focusing on stabilizer error correcting codes.

Hadamard-Pi: Equational Quantum Programming

Quantum computing offers advantages over classical computation, yet the precise features that set the two apart remain unclear. In the standard quantum circuit model, adding a 1-qubit basis-changing gate—commonly chosen to be the Hadamard gate—to a universal set of classical reversible gates yields computationally universal quantum computation. However, the computational behaviours enabled by this addition are not fully characterised. We give such a characterisation by introducing a small quantum programming language extending the universal classical reversible programming language Π with a single primitive corresponding to the Hadamard gate. The language comes equipped with a sound and complete categorical semantics that is specified by a purely equational theory. Completeness is shown by means of a novel finite presentation, and a corresponding synthesis algorithm, for the groups of orthogonal matrices with entries in the ring ℤ[1/√2].
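
For instance (a standard fact rather than part of the paper's development), the Hadamard gate itself is represented by the orthogonal matrix

  H = (1/√2) · [ 1   1 ]
               [ 1  −1 ]

whose entries ±1/√2 lie in the ring ℤ[1/√2] appearing in the completeness result.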

RapunSL: Untangling Quantum Computing with Separation, Linear Combination and Mixing

Quantum Separation Logic (QSL) has been proposed as an effective tool to improve the scalability of deductive reasoning for quantum programs. In QSL, separation is interpreted as disentanglement, and the frame rule brings a notion of entanglement-local specification (one that only talks about the qubits entangled with those acted upon by the program). In this paper, we identify two notions of locality unique to the quantum domain, and we construct a novel quantum separation logic, RapunSL, which is able to soundly reduce reasoning about superposition states to reasoning about pure states (basis-locality), and reasoning about mixed states arising from measurement to reasoning about pure states (outcome-locality). To do so, we introduce two connectives, linear combination and mixing, which together with separation provide a dramatic improvement in the scalability of reasoning, as we demonstrate on a series of challenging case studies.

Hyperfunctions: Communicating Continuations

A hyperfunction is a continuation-like construction that can be used to implement communication in the context of concurrency. Though it has been reinvented many times, it remains somewhat obscure: since its definition by Launchbury et al., hyperfunctions have been used to implement certain algebraic effect handlers, coroutines, and breadth-first traversals; however, in each of these examples, the hyperfunction type went unrecognised. We identify the hyperfunctions hidden in all of these algorithms, and we exposit the common pattern between them, building a framework for working with and reasoning about hyperfunctions. We use this framework to solve a long-standing problem: giving a fully-abstract continuation-based semantics for a concurrent calculus, the Calculus of Communicating Systems. Finally, we use hyperfunctions to build a monadic Haskell library for efficient first-class coroutines.
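
A minimal Haskell sketch of the hyperfunction type, in the form it usually takes in the literature following Launchbury et al. (illustrative names, not the API of the paper's library):

  -- A hyperfunction from a to b consumes a hyperfunction going the
  -- other way and produces a b; the two sides alternate like
  -- communicating coroutines.
  newtype Hyp a b = Hyp { invoke :: Hyp b a -> b }

  -- Precompose an ordinary function, deferring the rest of the
  -- computation to the argument hyperfunction.
  push :: (a -> b) -> Hyp a b -> Hyp a b
  push f q = Hyp (\k -> f (invoke k q))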

ArchSem: Reusable Rigorous Semantics of Relaxed Architectures

The specifications of mainstream processor architectures, such as Arm, x86, and RISC-V, underlie modern computing, as the targets of compilers, operating systems, and hypervisors. However, despite extensive research and tooling for instruction-set architecture (ISA) and relaxed-memory semantics, recently including systems features, there still do not exist integrated mathematical models that suffice for foundational formal verification of concurrent architecture properties or of systems software. Previous proof-assistant work has had to substantially simplify the ISA semantics, the concurrency model, or both.

We present ArchSem, an architecture-generic framework for architecture semantics, modularly combining ISA and concurrency models along a tractable interface of instruction-semantics effects that covers a range of systems aspects. To do so, one has to handle many previously unclear issues concerning the architectures themselves, the interface, the proper definition of reusable models, and the Rocq and Isabelle idioms required to make it usable. We instantiate it to the Arm-A and RISC-V instruction-set architectures and multiple concurrency models.

We demonstrate usability for proof, despite the scale, by establishing that the Arm architecture (in a particular configuration) provides a provable virtual memory abstraction, with a combination of Rocq, Isabelle, and paper proof. Previous work provides further confirmation of usability: the AxSL program logic for Arm relaxed concurrency was proved sound above an earlier version of ArchSem.

This establishes a basis for future proofs of architecture properties and systems software, above production architecture specifications.

Probabilistic Concurrent Reasoning in Outcome Logic: Independence, Conditioning, and Invariants

Although randomization has long been used in distributed computing, formal methods for reasoning about probabilistic concurrent programs have lagged behind. No existing program logics can express specifications about the full distributions of outcomes resulting from programs that are both probabilistic and concurrent. To address this, we introduce Probabilistic Concurrent Outcome Logic (pcOL), which incorporates ideas from concurrent and probabilistic separation logics into Outcome Logic to introduce new compositional reasoning principles. At its core, pcOL reinterprets the rules of Concurrent Separation Logic in a setting where separation models probabilistic independence, so as to compositionally describe joint distributions over variables in concurrent threads. Reasoning about outcomes also proves crucial, as case analysis is often necessary to derive precise information about threads that rely on randomized shared state. We demonstrate pcOL on a variety of examples, including proofs of almost-sure termination of unbounded loops.
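
As a hedged one-line illustration of reading separation as probabilistic independence (our own example, not the paper's notation): if two program variables x and y hold independent fair coin flips, then Pr[x = heads, y = heads] = Pr[x = heads] · Pr[y = heads] = 1/2 · 1/2 = 1/4, and it is exactly this kind of factorization of a joint distribution that a separating conjunction over the two variables is meant to capture.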

Characterizing Sets of Theories That Can Be Disjointly Combined

We study properties that allow first-order theories to be disjointly combined, including stable infiniteness, shininess, strong politeness, and gentleness. Specifically, we describe a Galois connection between sets of decidable theories, which picks out the largest set of decidable theories that can be combined with a given set of decidable theories. Using this, we exactly characterize the sets of decidable theories that can be combined with those satisfying well-known theory combination properties. This strengthens previous results and answers in the negative several long-standing open questions about the possibility of improving existing theory combination methods to apply to larger sets of theories. Additionally, the Galois connection gives rise to a complete lattice of theory combination properties, which allows one to generate new theory combination methods by taking meets and joins of elements of this lattice. We provide examples of this process, introducing new combination theorems. We situate both new and old combination methods within this lattice.

Local Contextual Type Inference

Type inference is essential for programming languages, yet complete and global inference quickly becomes undecidable in the presence of rich type systems like System F. Pierce and Turner proposed local type inference (LTI) as a scalable, partially annotated alternative by relying on information local to applications. While LTI has been widely adopted in practice, there are significant gaps between theory and practice, with its theory being underdeveloped and specifications for LTI being complex and restrictive.

We propose Local Contextual Type Inference, a principled redesign of LTI grounded in contextual typing—a recent formalism which captures type information flow. We present Contextual System F (Fc), a variant of System F with implicit and first-class polymorphism. We formalize Fc using a declarative type system, prove soundness, completeness, and decidability, and introduce matching subtyping as a bridge between declarative and algorithmic inference. This work offers the first mechanized treatment of LTI, while at the same time removing important practical restrictions and also demonstrating the power of contextual typing in designing robust, extensible and simple to implement type inference algorithms.

JAX Autodiff from a Linear Logic Perspective

JAX Autodiff refers to the core of the automatic differentiation (AD) systems developed in projects like JAX and Dex. JAX Autodiff has recently been formalised in a linear typed calculus by Radul et al. in POPL 2023. Although this formalisation suffices to express the main program transformations of AD, the calculus is very specific to this task, and it is not clear whether the type system yields a substructural logic of independent interest. We propose an encoding of JAX Autodiff into a linear λ-calculus that enjoys a Curry-Howard correspondence with Girard’s linear logic. We prove that the encoding is sound both qualitatively (the encoded terms are extensionally equivalent to the original ones) and quantitatively (the encoding preserves the original work cost as described by Radul et al.). As a byproduct, we show that unzipping, one of the transformations used to implement backpropagation in JAX Autodiff, is, in fact, optional.

TypeDis: A Type System for Disentanglement

Disentanglement is a runtime property of parallel programs guaranteeing that parallel tasks remain oblivious to each other's allocations. As demonstrated in the MaPLe compiler and run-time system, disentanglement can be exploited for fast automatic memory management, especially task-local garbage collection with no synchronization between parallel tasks. However, as a low-level property, disentanglement can be difficult to reason about for programmers. The only means of statically verifying disentanglement so far has been DisLog, an Iris-fueled variant of separation logic, mechanized in the Rocq proof assistant. DisLog is a fully-featured program logic, allowing for proof of functional correctness as well as verification of disentanglement. Yet its employment requires significant expertise and per-program proof effort.

This paper explores the route of automatic verification via a type system, ensuring that any well-typed program is disentangled and lifting the burden of carrying out manual proofs from the programmer. It contributes TypeDis, a type system inspired by region types, where each type is annotated with a timestamp, identifying the task that allocated it. TypeDis supports iso-recursive types as well as polymorphism over both types and timestamps. Crucially, timestamps are allowed to change during type-checking, at join points as well as via a form of subtyping, dubbed subtiming. The paper illustrates TypeDis and its features on a range of examples. The soundness of TypeDis and the examples are mechanized in the Rocq proof assistant, using an improved version of DisLog, dubbed DisLog2.

Network Change Validation with Relational NetKAT

Relational NetKAT (RN) is a new specification language for network change validation. Engineers use RN to specify intended changes by providing a trace relation R, which maps existing packet traces in the pre-change network to intended packet traces in the post-change network. The intended set of traces may then be checked against the actual post-change traces to uncover errors in implementation. Trace relations are constructed compositionally from a language of combinators that include trace insertion, trace deletion, and packet transformation, as well as regular operators for concatenation, union, and iteration of relations. We provide algorithms for converting trace relations into a new form of NetKAT transducer and also for constructing an automaton that recognizes the image of a NetKAT automaton under a NetKAT transducer. These algorithms, together with existing decision procedures for NetKAT automaton equivalence, suffice for validating network changes. We provide a denotational semantics for our specification language, prove our compilation algorithms correct, implement a tool for network change validation, and evaluate it on a set of benchmarks drawn from a production network and Amazon’s Batfish toolkit.

Typing Strictness

Strictness analysis is critical to efficient implementation of languages with non-strict evaluation, mitigating much of the performance overhead of laziness. However, reasoning about strictness at the source level can be challenging and unintuitive. We propose a new definition of strictness that refines the traditional one by describing variable usage more precisely. We lay type-theoretic foundations for this definition in both call-by-name and call-by-push-value settings, drawing inspiration from the literature on type systems tracking effects and coeffects. We prove via a logical relation that the strictness attributes computed by our type systems accurately describe the use of variables at runtime, and we offer a strictness-annotation-preserving translation from the call-by-name system to the call-by-push-value one. All our results are mechanized in Rocq.
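
A hedged Haskell illustration of the traditional notion being refined (ordinary Haskell, not the paper's type system):

  -- f is strict in x: forcing the result always forces x, so
  -- f undefined y diverges. It is not strict in y, since
  -- f 0 undefined evaluates to 0.
  f :: Int -> Int -> Int
  f x y = if x > 0 then y else 0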

An Expressive Assertion Language for Quantum Programs

In this paper, we define an assertion language designed for expectation-based reasoning about quantum programs. The key design idea is a representation of quantum predicates by quasi-probability distributions of generalized Pauli operators. Then we extend classical techniques such as Gödelization to prove that this language is expressive with respect to the quantum programs with loops—specifically, for any program S and any postcondition ψ formulated in the assertion language, the weakest precondition of S with respect to ψ can also be expressed as a formula in the assertion language. As an application, we present a sound and relatively complete quantum Hoare logic upon our expressive assertion language.

Fuzzing Guided by Bayesian Program Analysis

We propose a novel approach that leverages Bayesian program analysis to guide large-scale target-guided greybox fuzzing (LTGF). LTGF prioritizes program locations (targets) that are likely to contain bugs and applies directed mutation towards high-priority targets. However, existing LTGF approaches suffer from coarse and heuristic target prioritization strategies, and lack a systematic design to fully exploit feedback from the fuzzing process. We systematically define this prioritization process as the reachable fuzzing targets problem. Bayesian program analysis attaches probabilities to analysis rules and transforms the analysis results into a Bayesian model. By redefining the semantics of Bayesian program analysis, we enable the prediction of whether each target is reachable by the fuzzer, and dynamically adjust the predictions based on fuzzer feedback. On the one hand, Bayesian program analysis builds Bayesian models based on program semantics, enabling systematic and fine-grained prioritization. On the other hand, Bayesian program analysis systematically learns feedback from the fuzzing process, making its guidance adaptive. Moreover, this combination extends the application of Bayesian program analysis from alarm ranking to fully automated bug discovery. We implement our approach and evaluate it against several state-of-the-art fuzzers. On a suite of real-world programs, our approach discovers 3.25× to 13× more unique bugs compared to baselines. In addition, our approach identifies 39 previously unknown bugs in well-tested programs, 30 of which have been assigned CVEs.

ChiSA: Static Analysis for Lightweight Chisel Verification

The growing demand for productivity in hardware development opens up new opportunities for applying programming language (PL) techniques to hardware description languages (HDLs). Chisel, a leading agile HDL, embraces this shift by leveraging modern PL features to enhance hardware design productivity. However, verification for Chisel remains a major productivity bottleneck, requiring substantial time and manual effort. To address this issue, we advocate the use of static analysis—a technique proven well-suited to agile development workflows in software—for lightweight Chisel verification.

This work establishes a theoretical foundation for Chisel static analysis. At its core is λC, a formal core calculus of ChAIR (a Chisel-specific intermediate representation for analysis). λC is the first formalism that captures the essence of Chisel while being deliberately minimal to ease rigorous reasoning about static analysis built on λC. We prove key properties of λC that reflect real hardware characteristics, which in turn offer a form of retrospective validation for its design. On the basis of λC, we define and formalize the hardware value flow analysis (HVFA) problem, which underpins our static analyses for critical Chisel verification tasks, including bug detection and security analysis. We then propose a synchronized fixed-point solution to the HVFA problem, featuring hardware-specific treatment of the synchronous behavior of clock-driven hardware registers—the essential feature of Chisel programs. We further prove key theorems establishing the guarantees and limitations of our solution.

As a proof of concept, we develop ChiSA (30K+ LoC)—the first Chisel static analyzer that can analyze intricate hardware value flows to enable lightweight analyses for critical Chisel verification tasks such as bug detection and security analysis. To facilitate thorough evaluation of both ChiSA and future work, we provide ChiSABench (11M+ LoC), a comprehensive benchmark for Chisel static analysis.

Our evaluation on ChiSABench demonstrates that ChiSA offers an effective and significantly more lightweight approach for critical Chisel verification tasks, especially on large and complex real-world designs. For example, ChiSA identified 69 violable developer-inserted assertions in large-scale Chisel designs (9.7M+ LoC) in under 200 seconds—eight of which were recognized by developers and scheduled for future fixes—and detected all 60 information-leak vulnerabilities in the well-known TrustHub benchmark (1.1M+ LoC) in just one second—outperforming state-of-the-art Chisel approaches like ChiselTest’s bounded model checking and ChiselFlow’s secure type system. These results underscore the high promise of static analysis for lightweight Chisel verification. To encourage continued research and innovation, we will fully open-source ChiSA (30K+ LoC) and ChiSABench (11M+ LoC).

General Decidability Results for Systems with Continuous Counters

Counters that hold natural numbers are ubiquitous in modeling and verifying software systems; for example, they model dynamic creation and use of resources in concurrent programs. Unfortunately, such discrete counters often lead to extremely high complexity. Continuous counters are an efficient over-approximation of discrete counters. They are obtained by relaxing the original counters to hold values over the non-negative rational numbers.

This work shows that continuous counters are extraordinarily well-behaved in terms of decidability. Our main result is that, despite continuous counters being infinite-state, the language of sequences of counter instructions that can reach a given target configuration is regular. Moreover, a finite automaton for this language can be computed effectively. This implies that a wide variety of transition systems can be equipped with continuous counters, while maintaining decidability of reachability properties. Examples include higher-order recursion schemes, well-structured transition systems, and decidable extensions of discrete counter systems.
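
As a hedged worked example (assuming the usual continuous semantics, in which each instruction may be applied with a fractional coefficient α ∈ (0,1] scaling its effect): starting from counter value 0, the instruction sequence inc; dec; dec is infeasible under discrete semantics (it would pass through 1, 0, −1), but feasible continuously by applying inc with α = 1 and each dec with α = 1/2, which keeps the counter at the non-negative values 1, 1/2, 0.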

We also prove a non-elementary lower bound for the size of the resulting finite automaton.

Extensible Data Types with Ad-Hoc Polymorphism

This paper proposes a novel language design that combines extensible data types, implemented through row types and row polymorphism, with ad-hoc polymorphism, implemented through type classes. Our design introduces several new constructs and constraints useful for generic operations over rows. We formalize our design in a source calculus λρ, which elaborates into a target calculus Fω⊗⊕. We prove that the target calculus is type-safe and that the elaboration is sound, thus establishing the soundness of λρ. All proofs are mechanized in the Lean 4 proof assistant. Furthermore, we evaluate our type system using the Brown Benchmark for Table Types, demonstrating the utility of extensible rows with type classes for table types.

Optimising Density Computations in Probabilistic Programs via Automatic Loop Vectorisation

Probabilistic programming languages (PPLs) are a popular tool for high-level modelling across many fields. They provide a range of algorithms for probabilistic inference, which analyse models by learning their parameters from a dataset or estimating their posterior distributions. However, probabilistic inference is known to be very costly. One of the bottlenecks of probabilistic inference stems from the iteration over entries of a large dataset or a long series of random samples. Vectorisation can mitigate this cost, but manual vectorisation is error-prone, and existing automatic techniques are often ad-hoc and limited, unable to handle general repetition structures, such as nested loops and loops with data-dependent control flow, without significant user intervention. To address this bottleneck, we propose a sound and effective method for automatically vectorising loops in probabilistic programs. Our method achieves high throughput using speculative parallel execution of loop iterations, while preserving the semantics of the original loop through a fixed-point check. We formalise our method as a translation from an imperative PPL into a lower-level target language with primitives geared towards vectorisation. We implemented our method for the Pyro PPL and evaluated it on a range of probabilistic models. Our experiments show significant performance gains against an existing vectorisation baseline, achieving 1.1–6× speedups and reducing GPU memory usage in many cases. Unlike the baseline, which is limited to a subset of models, our method effectively handled all the tested models.

AdapTT: Functoriality for Dependent Type Casts

The ability to cast values between related types is a leitmotiv of many flavors of dependent type theory, such as observational type theories, subtyping, or cast calculi for gradual typing. These casts all exhibit a common structural behavior that boils down to the pervasive functoriality of type formers. We propose and extensively study a type theory, called AdapTT, which makes systematic and precise this idea of functorial type formers, with respect to an abstract notion of adapters relating types. Leveraging descriptions for functorial inductive types in AdapTT, we derive structural laws for type casts on general inductive type formers.

Quotient Polymorphism

Quotient types increase the power of type systems by allowing types to include equational properties. However, two key practical issues arise: code being duplicated, and valid code being rejected. Specifically, function definitions often need to be repeated for each quotient of a type, and valid functions may be rejected if they include subterms that do not respect the quotient. This article addresses these reusability and expressivity issues by introducing a notion of quotient polymorphism that we call choice polymorphism. We give practical examples of its use, develop the underlying theory, and implement it in Quotient Haskell.
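
A hedged Haskell illustration of both issues (ordinary Haskell, not Quotient Haskell syntax): view a multiset as a list quotiented by permutation of its elements, so a function is admissible on the quotient only if reordering its input cannot change its result.

  -- Respects the multiset quotient (sum is permutation-invariant),
  -- so it could be reused at the quotient type without duplication.
  total :: [Int] -> Int
  total = sum

  -- Does not respect the quotient: head [1,2] /= head [2,1], so this
  -- definition would rightly be rejected at the quotient type even
  -- though it is a valid function on lists.
  firstElem :: [Int] -> Int
  firstElem = head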

On Circuit Description Languages, Indexed Monads, and Resource Analysis

In this paper, a monad-based denotational model is introduced and shown adequate for the Proto-Quipper family of calculi, themselves being idealized versions of the Quipper programming language. The use of a monadic approach allows us to separate the value to which a term reduces from the circuit that the term itself produces as a side effect. In turn, this enables the denotational interpretation and validation of rich type systems in which the size of the produced circuit can be controlled. Notably, the proposed semantic framework, through the novel concept of circuit algebra, suggests forms of effect typing guaranteeing quantitative properties about the resulting circuit, even in the presence of optimizations.

Towards Pen-and-Paper-Style Equational Reasoning in Interactive Theorem Provers by Equality Saturation

Equations are ubiquitous in mathematical reasoning. Often, however, they only hold under certain conditions. As these conditions are usually clear from context, mathematicians regularly omit them when performing equational reasoning on paper. In contrast, interactive theorem provers pedantically insist on every detail to be convinced that a theorem holds, hindering equational reasoning at the more abstract level of pen-and-paper mathematics. In this paper, we address this issue by raising the level of equational reasoning to enable pen-and-paper-style equational reasoning in interactive theorem provers. We achieve this by interpreting theorems as conditional rewrite rules and using equality saturation to automatically derive equational proofs. Conditions that cannot be automatically proven may be surfaced as proof obligations. Concretely, we present how to interpret theorems as conditional rewrite rules for a significant class of theorems. Handling these theorems goes beyond simple syntactic rewriting, and deals with aspects like propositional conditions and type classes. We evaluate our approach by implementing it as a tactic in Lean, using the egg library for equality saturation with e-graphs. We show four use cases demonstrating the efficacy of this higher level of abstraction for equational reasoning.
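
As a hedged example of the kind of conditional rewrite rule meant here (our own, not drawn from the paper's evaluation), the field equation

  x / x = 1, provided x ≠ 0,

is routinely applied on paper with the side condition left implicit, whereas a proof assistant must either discharge x ≠ 0 automatically or surface it as a proof obligation.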

All for One and One for All: Program Logics for Exploiting Internal Determinism in Parallel Programs

Nondeterminism makes parallel programs challenging to write and reason about. To avoid these challenges, researchers have developed techniques for internally deterministic parallel programming, in which the steps of a parallel computation proceed in a deterministic way. Internal determinism is useful because it lets a programmer reason about a program as if it executed in a sequential order. However, no verification framework exists to exploit this property and simplify formal reasoning about internally deterministic programs.

To capture the essence of why internally deterministic programs should be easier to reason about, this paper defines a property called schedule-independent safety. A program satisfies schedule-independent safety if, to show that the program is safe across all orderings, it suffices to show that one terminating execution of the program is safe. We then present a separation logic called Musketeer for proving that a program satisfies schedule-independent safety. Once a parallel program has been shown to satisfy schedule-independent safety, we can verify it with a new logic called Angelic, which allows one to dynamically select and verify just one sequential ordering of the program.

Using Musketeer, we prove the soundness of MiniDet, an affine type system for enforcing internal determinism. MiniDet supports several core algorithmic primitives for internally deterministic programming that have been identified in the research literature, including a deterministic version of a concurrent hash set. Because any syntactically well-typed MiniDet program satisfies schedule-independent safety, we can apply Angelic to verify such programs.

All results in this paper have been verified in Rocq using the Iris separation logic framework.

Security Reasoning via Substructural Dependency Tracking

Substructural type systems provide the ability to speak about resources. By enforcing usage restrictions on inputs to computations they allow programmers to reify limited system units—such as memory—in types. We demonstrate a new form of resource reasoning founded on constraining outputs and explore its utility for practical programming. In particular, we identify a number of disparate programming features explored largely in the security literature as various fragments of our unified framework. These encompass capabilities, quantitative information leakage, sandboxing in the style of the Linux seccomp interface, authorization protocols, and more. We furthermore explore its connection to conventional input-based resource reasoning, casting it as an internal treatment of the constructive Kripke semantics of substructural logics. We verify the capability, quantity, and protocol safety of our system through a single logical relations argument. In doing so, we take the first steps towards obtaining the ultimate multitool for security reasoning.

Dependent Coeffects for Local Sensitivity Analysis

Differential privacy is a formal definition of privacy that bounds the maximum acceptable information leakage when a query is performed on sensitive data. To ensure this property, a key technique involves bounding the query’s sensitivity (how much input variations affect the output) and adding noise to the result according to this quantity. While prior work like the Fuzz type system focuses on global sensitivity, many useful queries have infinite global sensitivity, restricting the scope of such approaches. This limitation can be addressed by considering a more fine-grained measure: local sensitivity, which quantifies output change for inputs adjacent to a specific dataset. In this article, we introduce Local Fuzz, a type system with dependent coeffects designed to bound the local sensitivity of programs written in a simple functional language. We provide a denotational semantics for this system in the category of extended premetric spaces, leveraging the recently introduced construction of a dependently graded comonad. Finally, we illustrate how Local Fuzz can lead to better differential privacy guarantees than Fuzz, both for mechanisms that rely on global sensitivity and for those that leverage local sensitivity, such as the Propose-Test-Release framework.
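
A hedged textbook-style example (not taken from the paper): for f(x) = x² over the reals, with inputs considered adjacent when |x − x′| ≤ 1, the local sensitivity at x is

  LS_f(x) = sup over |x′ − x| ≤ 1 of |f(x′) − f(x)| = 2|x| + 1,

which is finite at every x, whereas the global sensitivity GS_f = sup_x LS_f(x) is infinite; queries of this shape are out of reach for purely global-sensitivity analyses.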

Separating the Wheat from the Chaff: Understanding (In-)Completeness of Proof Mechanisms for Separation Logic with Inductive Definitions

For over two decades Separation Logic has enjoyed its unique position as arguably the most popular framework for reasoning about heap-manipulating programs, as well as reasoning about shared resources and permissions. Separation Logic is often extended to include inductively-defined predicates, interpreted as least fixpoints, to form what is known as Separation Logic with Inductive Definitions (SLID). These inductive predicates are used to describe unbounded data-structures in the heap and to verify key properties thereof. Many theoretical and practical advances have been made in developing automated proof mechanisms for SLID, but by their very nature these mechanisms are imperfect, and a deeper understanding of their failures is desired. As expressive as Separation Logic is, it is not surprising that it is incomplete: there is no procedure that will provide a proof for all valid entailments in Separation Logic. In fact, at its very core, Separation Logic contains several sources of incompleteness that defy automated reasoning.

In this paper we study these sources of incompleteness and how they relate to failures of proof mechanisms of SLID. We contextualize SLID within a larger, relaxed logic, which we call Weak Separation Logic (WSL). We prove that unlike SLID, WSL enjoys completeness for a non-trivial fragment of quantified entailments with background theories and inductive definitions, via a reduction to first-order logic (FOL). Moreover, we show that the ubiquitous fold/unfold proof mechanism, which is unsurprisingly incomplete for SLID, does constitute a sound and complete proof mechanism of WSL, for theory-free, quantifier-free entailments with inductive definitions. In some sense, this shows that WSL is the natural logic of such proof mechanisms. Through this contextualization of SLID within WSL, we understand proof failures as stemming from rogue, nonstandard models that exist within the class of models considered by WSL but do not adhere to the stricter requirements of SLID. These rogue models are typically infinite, and we use the recently proposed formalism of symbolic structures to represent and automatically find them.

We present a prototype tool that implements the encoding of WSL to FOL and test it on an existing benchmark, which contains over 700 quantified entailment problems with inductive definitions, a third of which also contain background theories. Our tool is able to find counter-models to many of the examples, and we provide a partial taxonomy of the rogue models, shedding some light on real-world proof failures.

Algorithmic Conversion with Surjective Pairing: A Syntactic and Untyped Approach

In a dependent type theory with β-equivalence as its equational theory, the confluence and termination of untyped reduction immediately give us a proof of the decidability of type conversion, where the decision procedure for convertibility simply checks the equality of the β-normal forms of its inputs. This technique is not available in the presence of surjective pairing (i.e. the η-law for pairs) because βη-reduction is not confluent. In this work, we show that by adopting established syntactic techniques, we can resolve the issue with confluence caused by surjective pairing, and recover a confluence-based proof of decidability of type conversion. Compared to existing proof developments, which rely on semantic tools such as Kripke-style logical relations, our proof modularly composes a minimal semantic proof of untyped normalization and a syntactic proof of decidability. This modularity enables us to explore algorithmic conversion through syntactic methods without modifying the minimal semantic proof. We have fully mechanized our results using the Rocq theorem prover.
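
Concretely, surjective pairing is the η-law stating that every term of pair type equals the pair of its projections,

  ⟨π₁ p, π₂ p⟩ = p,

and adding this equation as a reduction rule to β-reduction is what breaks confluence of the untyped calculus.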

Abstraction Functions as Types: Modular Verification of Cost and Behavior in Dependent Type Theory

Software development depends on the use of libraries whose public specifications inform client code and impose obligations on private implementations; it follows that verification at scale must also be modular, preserving such abstraction. Hoare's influential methodology uses abstraction functions to demonstrate the coherence between such concrete implementations and their abstract specifications. However, the Hoare methodology relies on a conventional separation between implementation and specification, providing no linguistic support for ensuring that this convention is obeyed.
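
A hedged Haskell sketch of the classical methodology (ordinary Haskell, not the type-theoretic encoding developed in the paper): a concrete representation is related to its abstract meaning by an abstraction function, and each operation on the representation must commute with that function.

  import qualified Data.Set as Set

  -- Concrete representation of a finite set of Ints: an unordered list.
  type Rep = [Int]

  -- Hoare-style abstraction function from representations to meanings.
  abstract :: Rep -> Set.Set Int
  abstract = Set.fromList

  -- Coherence obligation for the insert operation:
  --   abstract (insertRep x r) == Set.insert x (abstract r)
  insertRep :: Int -> Rep -> Rep
  insertRep = (:)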

This paper proposes a synthetic account of Hoare's methodology within univalent dependent type theory by encoding the data of abstraction functions within types themselves. This is achieved via a phase distinction, which gives rise to a gluing construction that renders an abstraction function as a type and a pair of modalities that fracture a type into its concrete and abstract parts. A noninterference theorem governing the phase distinction characterizes the modularity guarantees provided by the theory.

This approach scales to verification of cost, allowing the analysis of client cost relative to a cost-aware specification. A monadic sealing effect facilitates modularity of cost, permitting an implementation to be upper-bounded by its specification in cases where private details influence observable cost. The resulting theory supports modular development of programs and proofs in a manner that hides private details of no concern to clients while permitting precise specifications of both the cost and behavior of programs.

Rows and Capabilities as Modal Effects

Effect handlers allow programmers to model and compose computational effects modularly. Effect systems statically guarantee that all effects are handled. Several recent practical effect systems are based on either row polymorphism or capabilities. However, there remains a gap in understanding the precise relationship between effect systems with such disparate foundations. The main difficulty is that in both row-based and capability-based systems, effect tracking is typically entangled with other features such as functions.

We propose a uniform framework for encoding, analysing, and comparing effect systems. Our framework exploits and generalises modal effect types, a recent novel effect system which decouples effect tracking from functions via modalities. Modalities offer fine-grained control over when and how effects are tracked, enabling us to express different strategies for effect tracking. We give encodings as macro translations from existing row-based and capability-based effect systems into our framework and show that these encodings preserve types and semantics. Our encodings reveal the essence of effect tracking mechanisms in different effect systems, enable a direct analysis on their differences, and provide practical insights on language design.

Tropical Mathematics and the Lambda-Calculus II: Tropical Geometry of Probabilistic Programming Languages

In the last few years there has been a growing interest in methods for statistical inference and learning based on computational geometry and, notably, tropical geometry, that is, the study of algebraic varieties over the min-plus semiring. At the same time, recent work has demonstrated the possibility of interpreting higher-order probabilistic programming languages in the framework of tropical mathematics, by exploiting algebraic and categorical tools coming from the semantics of linear logic. In this work we combine these two worlds, showing that tools and ideas from tropical geometry can be used to perform statistical inference over higher-order probabilistic programs. Notably, we first show that each such program can be associated with a degree and an n-dimensional polyhedron that encode its most likely runs. Then, we use these tools in order to design an intersection type system that estimates most likely runs in a compositional and efficient way.
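
For reference (standard definitions, independent of the paper), the min-plus or tropical semiring underlying tropical geometry is (ℝ ∪ {+∞}, ⊕, ⊗) with

  a ⊕ b = min(a, b)   and   a ⊗ b = a + b,

so tropical polynomials are piecewise-linear functions and tropical varieties are polyhedral objects, of the same flavour as the polyhedra used above to encode most likely runs.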

A Relational Separation Logic for Effect Handlers

Effect handlers offer a powerful and relatively simple mechanism for controlling a program's flow of execution. Since their introduction, an impressive array of verification tools for effect handlers has been developed. However, to this day, no framework can express and prove relational properties about programs that use effect handlers in languages such as OCaml and Links, where programming features like mutable state and concurrency are readily available. To this end, we introduce blaze, the first relational separation logic for effect handlers. We build blaze on top of the Rocq implementation of the Iris separation logic, thereby enjoying the rigour of a mechanised theory and all the reasoning properties of a modern fully-fledged concurrent separation logic, such as modular reasoning about stateful concurrent programs and the ability to introduce user-defined ghost state. In addition to familiar reasoning rules, such as the bind rule and the frame rule, blaze offers rules to reason modularly about programs that perform and handle effects. Significantly, when verifying that two programs are related, blaze does not require that effects and handlers from one program be in correspondence with effects and handlers from the other. To assess this flexibility, we conduct a number of case studies: most noticeably, we show how different implementations of an asynchronous-programming library using effects are related to truly concurrent implementations. As side contributions, we introduce two new, simple, and general reasoning rules for concurrent relational separation logic that are independent of effects: a logical-fork rule that allows one to reason about an arbitrary program phrase as if it had been spawned as a thread and a thread-swap rule that allows one to reason about how threads are scheduled.

Compiling to Linear Neurons

We don’t program neural networks directly. Instead, we rely on an indirect style where learning algorithms, like gradient descent, determine a neural network’s function by learning from data. This indirect style is often a virtue; it empowers us to solve problems that were previously impossible. But it lacks discrete structure. We can’t compile most algorithms into a neural network—even if these algorithms could help the network learn. This limitation occurs because discrete algorithms are not obviously differentiable, making them incompatible with the gradient-based learning algorithms that determine a neural network’s function. To address this, we introduce Cajal: a typed, higher-order and linear programming language intended to be a minimal vehicle for exploring a direct style of programming neural networks. We prove Cajal programs compile to linear neurons, allowing discrete algorithms to be expressed in a differentiable form compatible with gradient-based learning. With our implementation of Cajal, we conduct several experiments where we link these linear neurons against other neural networks to determine part of their function prior to learning. Linking with these neurons allows networks to learn faster, with greater data-efficiency, and in a way that’s easier to debug. A key lesson is that linear programming languages provide a path towards directly programming neural networks, enabling a rich interplay between learning and the discrete structures of ordinary programming.

Handling Higher-Order Effectful Operations with Judgemental Monadic Laws

This paper studies the design of programming languages with handlers of higher-order effectful operations – effectful operations that may take in computations as arguments or return computations as output. We present and analyse a core calculus with higher-kinded impredicative polymorphism, handlers of higher-order effectful operations, and optionally general recursion. The distinctive design choice of this calculus is that handlers are carried by lawless raw monads, while the computation judgements still satisfy the monadic laws judgementally. We present the calculus with a logical framework and give denotational models of the calculus using realizability semantics. We prove closed-term canonicity and parametricity for the recursion-free fragment of the language using synthetic Tait computability and a novel form of the ⊤⊤-lifting technique.

Accelerating Syntax-Guided Program Synthesis by Optimizing Domain-Specific Languages

Syntax-guided program synthesis relies on domain-specific languages (DSLs) to constrain the search space and improve efficiency. However, manually designing optimal DSLs is challenging and often results in suboptimal performance. In this paper, we propose AMaze, a novel framework that automatically optimizes DSLs to accelerate synthesis. AMaze iteratively refines a DSL by identifying key program fragments, termed feature components, whose enumeration ranks correlate with synthesis time. Using a dynamic-programming-based algorithm to calculate enumeration ranks of feature components and a machine learning model based on them, AMaze estimates synthesis cost instead of directly invoking the synthesizer, which is impractical due to high computational cost. We evaluate AMaze on state-of-the-art synthesizers, including DryadSynth, Duet, Polygen, and EUsolver, across multiple domains. Empirical results demonstrate that AMaze achieves up to a 4.35× speedup, effectively reducing synthesis time while maintaining expressiveness.

Parametrised Verification of Intel-x86 Programs

We address the reachability problem for concurrent programs with an arbitrary number of threads running over the Intel x86 architecture. We consider the formal model eTSO for Intel x86 defined by Raad et al. in POPL 2022. This model takes into account multiple memory types and non-temporal writes, combining in a complex way features in the TSO and PSO weak memory models. In PLDI 2024, Abdulla et al. proved that this problem is undecidable for eTSO in general, but that it is decidable under k-alternation bounding when computations have, for some fixed bound k, at most k alternations of TSO segments (with TSO writes only) and PSO segments (with PSO writes only) for each thread. The proof of this result assumes that the number of threads is fixed, and relies crucially on referring to thread identities. In this paper, we prove the decidability of the k-alternation bounded reachability problem of eTSO in the parametrized setting when the number of threads is a parameter that can be arbitrarily high. The proof is nontrivial as it cannot refer to thread identities, their number being unbounded. We show that it is possible to overcome this difficulty using a novel and quite complex reduction to reachability in well-structured systems on domains that are BQOs (Better Quasi Orders). This is the first time that BQO-based well-structured systems are used to prove the decidability of verification of infinite state programs.

Handling Scope Checks: A Comparative Framework for Dynamic Scope Extrusion Checks

Metaprogramming and effect handlers interact in unexpected, and sometimes undesirable, ways. One example is scope extrusion: the generation of ill-scoped code. Scope extrusion can either be preemptively prevented, via static type systems, or retroactively detected, via dynamic checks. Static type systems exist in theory, but struggle with a range of implementation and usability problems in practice. In contrast, dynamic checks exist in practice (e.g. in MetaOCaml), but are understudied in theory. Designers of metalanguages are thus given little guidance regarding the design and implementation of checks. We present the first formal study of dynamic scope extrusion checks, introducing a calculus (λ⟨⟨op⟩⟩) for describing and evaluating checks. Further, we introduce a novel dynamic check — the “Cause-for-Concern” check — which we prove correct, characterise without reference to its implementation, and argue combines the advantages of existing dynamic checks. Finally, we extend our framework with refined environment classifiers, which statically prevent scope extrusion, and compare their expressivity with the dynamic checks.

Endangered by the Language But Saved by the Compiler: Robust Safety via Semantic Back-Translation

It is common for programmers to assemble their programs from a combination of trusted and untrusted components. In this context, a trusted program component is said to be robustly safe if it behaves safely when linked against arbitrary untrusted code. Prior work has shown how various encapsulation mechanisms (in both high- and low-level languages) can be used to protect code so that it is robustly safe, but none of the existing work has explored how robust safety can be achieved in a patently unsafe language like C.

In this paper, we show how to bring robust safety to a simple yet representative C-like language we call Rec. Although Rec (like C) is inherently “dangerous” and thus not robustly safe, we can “save” Rec programs via compilation to Cap, a CHERI-like capability machine. To formalize the benefits of such a hardening compiler, we develop Reckon, a separation logic for verifying robust safety of Rec programs. Reckon is not sound under Rec’s unsafe, C-like semantics, but it is sound when Rec programs are hardened via compilation and linked against untrusted code running on Cap. As a crucial step in proving soundness of Reckon, we introduce a novel technique of semantic back-translation, which we formalize by building on the DimSum framework for multi-language semantics. All our results are mechanized in the Rocq prover.

Arbitration-Free Consistency Is Available (and Vice Versa)

The fundamental tension between availability and consistency shapes the design of distributed storage systems. Classical results capture extreme points of this trade-off: the CAP theorem shows that strong models like linearizability preclude availability under partitions, while weak models like causal consistency remain implementable without coordination. These theorems apply to simple read-write interfaces, leaving open a precise characterization of which combinations of object semantics and consistency models admit available implementations.

This paper develops a general semantic framework in which storage specifications combine operation semantics and consistency models. The framework encompasses a broad range of objects (key-value stores, counters, sets, CRDTs, and SQL databases) and consistency models (from causal consistency and sequential consistency to snapshot isolation and bounded staleness).

Within this framework, we prove the Arbitration-Free Consistency (AFC) theorem, showing that an object specification within a consistency model admits an available implementation if and only if it is arbitration-free, that is, it does not require a total arbitration order to resolve visibility or read dependencies.

The AFC theorem unifies and generalizes previous results, revealing arbitration-freedom as the fundamental property that delineates coordination-free consistency from inherently synchronized behavior.

The Ghosts of Empires: Extracting Modularity from Interleaving-Based Proofs

Implementation bugs threaten the soundness of algorithmic software verifiers. Generating correctness certificates for correct programs allows for efficient independent validation of verification results, and thus helps to reveal such bugs. Automatic generation of small, compact correctness proofs for concurrent programs is challenging, as the correctness arguments may depend on the particular interleaving, which can lead to exponential explosion. We present an approach that converts an interleaving-based correctness proof, as generated by many algorithmic verifiers, into a thread-modular correctness proof in the style of Owicki and Gries. We automatically synthesize ghost variables that capture the relevant interleaving information, and abstract away irrelevant details. Our evaluation shows that the approach is efficient in practice and generates compact proofs, compared to a baseline.

Canonicity for Indexed Inductive-Recursive Types

We prove canonicity for a Martin-Löf type theory with a countable universe hierarchy where each universe supports indexed inductive-recursive (IIR) types. We proceed in two steps. First, we construct IIR types from inductive-recursive (IR) types and other basic type formers, in order to simplify the subsequent canonicity proof. The constructed IIR types support the same definitional computation rules that are available in Agda's native IIR implementation. Second, we give a canonicity proof for IR types, building on the established method of gluing along the global sections functor. The main idea is to encode the canonicity predicate for each IR type using a metatheoretic IIR type.

Context-Free-Language Reachability for Almost-Commuting Transition Systems

We extend the scope of context-free-language (CFL) reachability to a new class of infinite-state systems. Parikh’s Theorem is a useful tool for solving CFL-reachability problems for transition systems that consist of commuting transition relations. It implies that the image of a context-free language under a homomorphism into a commutative monoid is semi-linear, and that there is a linear-time algorithm for constructing a Presburger arithmetic formula that represents it. However, for many transition systems of interest, transitions do not commute.
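
As a standard illustration of Parikh's Theorem (not specific to the paper), the context-free language {aⁿbⁿ : n ≥ 0} has Parikh image {(n, n) : n ≥ 0} = {(0, 0) + k·(1, 1) : k ≥ 0}, a linear (hence semi-linear) subset of ℕ², definable by the Presburger formula #a = #b.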

In this paper, we introduce almost-commuting transition systems, which pair finite-state control with commutative components, but which are in general not commutative. We extend Parikh’s theorem to show that the image of a context-free language under a homomorphism into an almost-commuting monoid is semi-linear and that there is a polynomial-time algorithm for constructing a Presburger arithmetic formula that represents it. This result yields a general framework for solving CFL-reachability problems over almost-commuting transition systems. We describe several examples of systems within this class. Finally, we examine closure properties of almost-commuting monoids that can be used to modularly compose almost-commuting transition systems while remaining in the class.

Recurrence Sets for Proving Fair Non-termination under Axiomatic Memory Consistency Models

Recurrence sets characterize non-termination in sequential programs. We present a generalization of recurrence sets to concurrent programs that run on weak memory models. Sequential programs have operational semantics in terms of states and transitions, and classical recurrence sets are defined as sets of states that are existentially closed under transitions. Concurrent programs have axiomatic semantics in terms of executions, and our new recurrence sets are defined as sets of executions that are existentially closed under extensions.
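
For orientation, the classical state-based notion being generalized can be stated as follows (standard in the non-termination literature; the notation is not taken from this paper): a set of states $R$ is a recurrence set of a transition system with transition relation $\to$ if $R$ contains some reachable state and $\forall s \in R.\ \exists s' \in R.\ s \to s'$; any such $R$ witnesses an infinite run. The generalization above replaces states by axiomatic executions and transitions by execution extensions, keeping the same existentially closed shape.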

The semantics of concurrent programs is not only affected by the memory model, but also by fairness assumptions about its environment, be it the scheduler or the memory subsystems. Our new recurrence sets are formulated relative to such fairness assumptions. We show that our recurrence sets are sound for proving fair non-termination on all practical memory models, and even complete on many.

To turn our theory into practice, we develop a new automated technique for proving fair non-termination in concurrent programs on weak memory models. At the heart of this technique is a finite representation of recurrence sets in terms of execution-based lassos. We implemented a lasso-finding algorithm in Dartagnan, and evaluated it on a number of programs running under CPU and GPU memory models.

U-Turn: Enhancing Incorrectness Analysis by Reversing Direction

O'Hearn's Incorrectness Logic (IL) has sparked renewed interest in static analyses that aim to detect program errors rather than prove their absence, thereby avoiding false alarms—a critical factor for practical adoption in industrial settings. As new incorrectness logics emerge to capture diverse error-related properties, a key question arises: can combining correctness and incorrectness techniques enhance precision, expressiveness, automation, or scalability? Notable frameworks, such as outcome logic, UNTer, local completeness logic, and exact separation logic, unify multiple analyses within a single proof system. In this work, we adopt a complementary strategy. Rather than designing a unified logic, we combine IL, which identifies reachable error states, with Sufficient Incorrectness Logic (SIL), which finds input states potentially leading to those errors. As a result, we get a more informative and effective analysis than either logic in isolation. Rather than sequencing them, our key innovation is reusing heuristic choices from the first analysis to steer the second. In fact, both IL and SIL rely on under-approximation and thus their automation legitimizes heuristics that avoid exhaustive path enumeration (e.g., selective disjunct pruning, loop unrolling). Concretely, we instrument the proof rules of the second logic with derivations from the first to inductively guide rule selection and application. To our knowledge, this is the first rule format enabling such inter-analysis instrumentation. This combined analysis aids debugging and testing by revealing both reachable errors and their causes, and opens new avenues for embedding incorrectness insights into scalable, expressive, automated code contracts.

The Simple Essence of Boolean-Algebraic Subtyping: Semantic Soundness for Algebraic Union, Intersection, Negation, and Equi-recursive Types

Boolean-algebraic subtyping (BAS) is a powerful subtyping approach introduced in 2022 as the "secret sauce" enabling backtracking-free principal type inference in the MLstruct research language, a structurally-typed functional programming language with tagged records, tag and record subtyping, and tag-based pattern matching. By supporting distributive intersection, union, negation, and equi-recursive types, MLstruct can express powerful programming patterns, such as subtyped extensible variants, without needing row variables. But the use of atypical subtyping rules that violate some interpretations of intersection and union types, the mutual distributivity between these types, and the complexity of coinductive reasoning for equi-recursive types have collectively made the study of BAS difficult. The syntactic soundness proofs provided in the original work are dauntingly complicated and long-winded, obscuring the intuitions behind the correctness of BAS.

In this paper, we distill the simple essence of Boolean-algebraic subtyping: we discover that BAS can be understood through five families of characteristic Boolean homomorphisms defined on types in context. Two of these map to power sets of simpler objects; the rest map back to types, but under an unguarded coinductive assumptions context. Together, these homomorphisms let us prove rather directly that BAS is sound, in that it does not relate constructors of incompatible runtime shapes. These homomorphisms are characteristic in the sense that they are sufficient to capture the meaning of subtyping: we prove that if an inequality holds between two types under all these homomorphisms, then subtyping holds between the two types in the original context. This directly suggests a new subtyping decision procedure for BAS, which avoids some inefficiencies in the original algorithm, although it still has exponential worst-case time complexity. We prove that the subtyping problem is in fact co-NP-hard even without recursive types. Finally, we discover that BAS is already powerful enough to encode the removal of a field from a type. This allows us to support extensible records through one new term form and one new typing rule, but, perhaps surprisingly, no changes to subtyping at all.

Our new approach to the semantics of BAS sheds some light on the core of MLstruct’s type system. It could be adapted to other languages with algebraic flavors of subtyping, such as Scala 3 and Ceylon, making their design and verification more approachable. Tellingly, all our subtyping soundness proofs fit inside the main body of this paper, with only some administrative lemmas relegated to the appendix.

Miri: Practical Undefined Behavior Detection for Rust

The Rust programming language has two faces: on the one hand, it is a high-level language with a strong type system ensuring memory and thread safety. On the other hand, Rust crucially relies on unsafe code for cases where the compiler is unable to statically ensure basic safety properties. The challenges of writing unsafe Rust are similar to those of writing C or C++: a single mistake in the program can lead to Undefined Behavior, which means the program is no longer described by the language's Abstract Machine and can go wrong in arbitrary ways, often causing security issues.

Ensuring the absence of Undefined Behavior bugs is therefore a high priority for unsafe Rust authors. In this paper we present Miri, the first tool that can find all de-facto Undefined Behavior in deterministic Rust programs. Some of the key non-trivial features of Miri include tracking of pointer provenance, validation of Rust type invariants, data-race detection, exploration of weak memory behaviors, and implementing enough basic OS APIs (such as file system access and concurrency primitives) to be able to run unchanged real-world Rust code. In an evaluation on more than 100 000 Rust libraries, Miri was able to successfully execute more than 70% of the tests across their combined test suites. Miri has found dozens of real-world bugs and has been integrated into the continuous integration of the Rust standard library and many prominent Rust libraries, preventing many more bugs from ever entering these codebases.

Verifying Almost-Sure Termination for Randomized Distributed Algorithms

We present a technique for the verification of liveness properties of randomized distributed algorithms. Our technique gives SMT-based proofs for many common consensus algorithms, both for crash faults and for Byzantine faults. It is based on a sound proof rule for fair almost-sure termination of distributed systems that combines martingale-based techniques for almost-sure termination with reasoning about weak fairness.

Our proof rule is able to handle parametrized protocols where the state grows unboundedly and every variant function is unbounded. These protocols were out of scope for previous approaches, which either relied on bounded variant functions or on reductions to (non-probabilistic) fairness.

We have implemented our proof rules on top of Caesar, a program verifier for probabilistic programs. We use our proof rule to give SMT-based proofs for termination properties of randomized asynchronous consensus protocols, including Ben-Or's protocol and graded binary consensus, for both crash and Byzantine faults. These protocols have notoriously difficult proofs of termination but fall within the scope of our proof rule.

A Synthetic Reconstruction of Multiparty Session Types

Multiparty session types (MPST) provide a rigorous foundation for verifying the safety and liveness of concurrent systems. However, existing approaches often force a difficult trade-off: classical, projection-based techniques are compositional but limited in expressiveness, while more recent techniques achieve higher expressiveness by relying on non-compositional, whole-system model checking, which scales poorly.

This paper introduces a new approach to MPST that delivers both expressiveness and compositionality, called the synthetic approach. Our key innovation is a type system that verifies each process directly against a global protocol specification, represented as a labelled transition system (LTS) in general, with global types as a special case. This approach uniquely avoids the need for intermediate local types and projection.

We demonstrate that our approach, while conceptually simpler, supports a benchmark of challenging protocols that were previously beyond the reach of compositional techniques in the MPST literature. We generalise our type system, showing that it can validate processes against any specification that constitutes a "well-behaved" LTS, supporting protocols not expressible with the standard global type syntax. The entire framework, including all theorems and many examples, has been formalised and mechanised in Agda, and we have developed a prototype implementation as an extension to VS Code.

A Modular Static Cost Analysis for GPU Warp-Level Parallelism

Graphics Processing Units (GPUs) are the accelerator of choice for performance-critical applications, yet optimizing for performance requires mastery of the complex interactions between their memory architecture and their execution model. Existing static analysis tools for GPU kernels either identify performance bugs without quantifying costs or cannot handle thread-divergent control flow, leading to significant over-approximations. We present the first static relational-cost analysis for GPU warp-level parallelism that can give exact bounds even in the presence of thread divergence. Our analysis is general and flexible, as it is parametric on the resource metric (uncoalesced accesses, bank conflicts) and on the cost relation (=, ≤, ≥). We establish a soundness theorem for our technique, provide mechanized proofs in Rocq and implement our theory in a tool called Pico. In a reproducibility experiment, Pico produced the tightest bounds on every input, outperforming the state-of-the-art tool RaCUDA in 10 kernels (1.7× better), while RaCUDA produced 4 incorrect bounds and crashed on 2 kernels. In an experiment to measure the accuracy of Pico, we studied the impact of thread divergence in control flow in a dataset of 226 kernels. We found that at least 75.3% of conditionals and 85.4% of loops can be captured exactly, without introducing approximation.

Inductive Program Synthesis by Meta-Analysis-Guided Hole Filling

A popular approach to inductive program synthesis is to construct a target program via top-down search, starting from an incomplete program with holes and gradually filling these holes until a solution is found. During the search, abstraction-based pruning is used to eliminate infeasible candidate programs, significantly reducing the search space. Because of this pruning, the order in which holes are filled can drastically affect search efficiency: a wise choice can prune large swaths of the search space early, while a poor choice might explore many dead ends. However, the choice of hole-filling order has received little attention in the program synthesis literature.

In this paper, we propose a novel hole-filling strategy that leverages abstract interpretation to guide the order of hole-filling in program synthesis. Our approach overapproximates the behavior of the underlying abstract interpreter for pruning, enabling it to predict the most promising hole to fill next. We instantiate our approach in the domains of bitvectors and strings, which are commonly used in program synthesis tasks. We evaluate our approach on a set of benchmarks from prior work, including SyGuS benchmarks, and show that it significantly outperforms the state-of-the-art approaches in terms of efficiency, thanks to the guidance obtained by abstracting the underlying abstract interpretation.

A Lazy, Concurrent Convertibility Checker

Convertibility checking — determining whether two lambda-terms are equal up to reductions — is a crucial component of proof assistants and dependently-typed languages. Practical implementations often use heuristics to quickly conclude that two terms are convertible, or are not convertible, without reducing them to normal form. However, these heuristics can backfire, triggering huge amounts of unnecessary computation. This paper presents a novel convertibility-checking algorithm that relies crucially on laziness and concurrency. Laziness is used to share computations, while concurrency is used to explore multiple convertibility subproblems in parallel or via fair interleaving. Unlike heuristics-based approaches, our algorithm always finds an easy solution to the convertibility problem, if one exists. The paper describes the algorithm in process calculus style, discusses its complexity, and reports on its mechanized proof of partial correctness and its lightweight experimental evaluation.

Bayesian Separation Logic: A Logical Foundation and Axiomatic Semantics for Probabilistic Programming

Bayesian probabilistic programming languages (BPPLs) let users denote statistical models as code while the interpreter infers the posterior distribution. The semantics of BPPLs is usually mathematically complex and does not readily support reasoning about desirable properties such as expected values and independence of random variables. To reason about these properties in a non-Bayesian setting, probabilistic separation logics such as PSL and Lilac interpret separating conjunction as probabilistic independence of random variables. However, no existing separation logic can handle Bayesian updating, which is the key distinguishing feature of BPPLs.

To close this gap, we introduce Bayesian separation logic (BaSL), a probabilistic separation logic that gives semantics to BPPL. We prove an internal version of Bayes’ theorem using a result in measure theory known as the Rokhlin-Simmons disintegration theorem. Consequently, BaSL can model probabilistic programming concepts such as Bayesian updating, unnormalised distribution, conditional distribution, soft constraint, conjugate prior and improper prior while maintaining modularity via the frame rule. The model of BaSL is based on a novel instantiation of Kripke resource monoid via σ-finite measure spaces over the Hilbert cube, and the semantics of Hoare triples is compatible with an existing denotational semantics of BPPL based on the category of s-finite kernels. Using BaSL, we then prove properties of statistical models such as the expected value of a Bayesian coin flip, correlation of random variables in the collider Bayesian network, the posterior distributions of the burglar alarm model, a parameter estimation algorithm, and the Gaussian mixture model.
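
For orientation, the classical identity that the internal theorem mirrors is Bayes' rule in density form, $p(\theta \mid x) = p(x \mid \theta)\, p(\theta) \,/\, \int p(x \mid \theta')\, p(\theta')\, \mathrm{d}\theta'$ (standard notation, not the paper's); disintegration is what makes the corresponding conditioning well defined at the level of measures rather than densities.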

Higher-Order Behavioural Conformances via Fibrations

Coinduction is a widely used technique for establishing behavioural equivalence of programs in higher-order languages. In recent years, the rise of languages with quantitative (e.g. probabilistic) features has led to extensions of coinductive methods to more refined types of behavioural conformances, most notably notions of behavioural distance. To guarantee soundness of coinductive reasoning, one needs to show that the behavioural conformance at hand forms a program congruence, i.e. it is suitably compatible with the operations of the language. This is usually achieved by a complex proof technique known as Howe’s method, which needs to be carefully adapted to both the specific language and the targeted notion of behavioural conformance. We develop a uniform categorical approach to Howe’s method that features two orthogonal dimensions of abstraction: (1) the underlying higher-order language is modelled by an abstract higher-order specification (AHOS), a novel and very general categorical account of operational semantics, and (2) notions of behavioural conformance (such as relations or metrics) are modelled via fibrations over the base category of an AHOS. Our main result is a fundamental congruence theorem at this level of generality: Under natural conditions on the categorical ingredients and the operational rules of a language modelled by an AHOS, the greatest behavioural (bi)conformance on its operational model forms a congruence. We illustrate our theory by deriving congruence of bisimilarity and behavioural pseudometrics for probabilistic higher-order languages.

Determination Problems for Orbit Closures and Matrix Groups

Computational problems concerning the orbit of a point under the action of a matrix group occur throughout computer science, including in program analysis, complexity theory, quantum computation, and automata theory. In many cases the focus extends beyond orbits proper to orbit closures under a suitable topology. Typically one starts from a group and a set of points and asks questions about the orbit closure of the set under the action of the group, e.g., whether two given orbit closures intersect.

In this paper we consider a collection of what we call determination problems concerning matrix groups and orbit closures. These problems begin with a given variety and seek to understand whether and how it arises either as an algebraic matrix group or as an orbit closure. The how question asks whether the underlying group is s-generated, meaning it is topologically generated by s matrices for a given number s. Among other applications, problems of this type have recently been studied in the context of synthesising loops subject to certain specified invariants on program variables.

Our main result is a polynomial-space procedure that inputs a variety and a number s and determines whether the given variety arises as an orbit closure of a point under an s-generated commutative algebraic matrix group. The main tools in our approach are structural properties of commutative algebraic matrix groups and module theory. We leave open the question of determining whether a variety is an orbit closure of a point under an s-generated algebraic matrix group (without the requirement of commutativity).

Domain-Theoretic Semantics for Functional Logic Programming

Functional Logic Programming (FLP) is a paradigm that extends higher-order functional programming with nondeterministic choice, logical variables, and equational constraints. Starting from the observation that these constructs can be presented as algebraic effects, we rationally reconstruct a core calculus for FLP that is based on call-by-push-value, and supports higher-order functions and recursion. We show how to execute its programs through an abstract machine that implements narrowing. Finally, we present a domain-theoretic semantics based on the lower powerdomain, which we prove to be sound, adequate, and fully abstract with respect to the machine. This leads to an exploration of the limitations of domain theory in modelling FLP.

Consistent Updates for Scalable Microservices

Online services are commonly implemented with a scalable microservice architecture, where isomorphic workers process client requests, recording persistent state in a backend data store. To maintain service, modifications to service functionality must be made on the fly -- i.e., as the service continues to process client requests -- but doing so is challenging. The central difficulty is that of avoiding inconsistencies from mixed-mode operation, caused by workers of current and new versions interacting via the data store. Some update methods avoid mixed-mode altogether, but only at the cost of substantial inefficiency -- by doubling resources (memory and compute), or by halving throughput. The alternative is an uncontrolled "rolling" update, which runs the risk of serious service failures arising from inconsistent mixed-mode behavior.

Ideally, it should appear to every client that a service update takes effect atomically; this ensures that a client is not exposed to inconsistent mixed-mode behavior. In this paper, we introduce a framework that formalizes this intuition and develop foundational theory for reasoning about update consistency. We apply this theory to derive the first algorithms that guarantee consistency for mixed-mode updates. The algorithms rely on semantic properties of service actions, such as commutativity. We show that this is unavoidable, by proving that any semantically oblivious mixed-mode update method must allow inconsistencies.

Zoo: A Framework for the Verification of Concurrent OCaml 5 Programs using Separation Logic

The release of OCaml 5, which introduced parallelism in the OCaml runtime, drove the need for safe and efficient concurrent data structures. New libraries like Saturn address this need. This is an opportunity to apply and advance state-of-the-art program verification techniques.

We present Zoo, a framework for verifying fine-grained concurrent OCaml 5 algorithms. Following a pragmatic approach, we defined a limited but sufficient fragment of the language to faithfully express these algorithms: ZooLang. We formalized its semantics carefully via a deep embedding in the Rocq proof assistant, uncovering subtle aspects of physical equality. We provide a tool to translate source OCaml programs into ZooLang syntax embedded inside Rocq, where they can be specified and verified using the Iris concurrent separation logic. To illustrate the applicability of Zoo, we verified a subset of the standard library and a collection of fine-grained concurrent data structures from the Saturn and Eio libraries.

In the process, we also extended OCaml to more efficiently express certain concurrent programs.

The Relative Monadic Metalanguage

Relative monads provide a controlled view of computation. We generalise the monadic metalanguage to a relative setting and give a complete semantics with strong relative monads. Adopting this perspective, we generalise two existing program calculi from the literature. We provide a linear-non-linear language for graded monads, LNL-RMM, along with a semantic proof that it is a conservative extension of the graded monadic metalanguage. Additionally, we provide a complete semantics for the arrow calculus, showing it is a restricted relative monadic metalanguage. This motivates the introduction of ARMM, a computational lambda calculus-style language for arrows that conservatively extends the arrow calculus.
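
For readers who want the baseline definition being generalised (the standard notion of Altenkirch, Chapman and Uustalu, stated in extension form; the notation is not the paper's): a monad relative to a functor $J : \mathbb{J} \to \mathbb{C}$ consists of an object mapping $T$, a unit $\eta_X : JX \to TX$ for each $X$, and an extension operation taking $f : JX \to TY$ to $f^\dagger : TX \to TY$, subject to $\eta_X^\dagger = \mathrm{id}_{TX}$, $f^\dagger \circ \eta_X = f$, and $(g^\dagger \circ f)^\dagger = g^\dagger \circ f^\dagger$. Ordinary monads are recovered when $J$ is the identity functor.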

Di- is for Directed: First-Order Directed Type Theory via Dinaturality

We show how dinaturality plays a central role in the interpretation of directed type theory where types are given by (1-)categories and directed equality by hom-functors. We introduce a first-order directed type theory where types are semantically interpreted as categories, terms as functors, predicates as dipresheaves, and proof-relevant entailments as dinatural transformations. This type theory is equipped with an elimination principle for directed equality, motivated by dinaturality, which closely resembles the J-rule used in Martin-Löf type theory. This directed J-rule comes with a simple syntactic restriction which recovers all theorems about symmetric equality, except for symmetry. Dinaturality is used to prove properties about transitivity (composition), congruence (functoriality), and transport (coYoneda) in exactly the same way as in Martin-Löf type theory, and allows us to obtain an internal "naturality for free". We then argue that the quantifiers of directed type theory should be ends and coends, which dinaturality allows us to capture formally. Our type theory provides a formal treatment of (co)end calculus and Yoneda reductions, which we use to give distinctly logical proofs of the (co)Yoneda lemma, the adjointness property of Kan extensions via (co)ends, exponential objects of presheaves, and the Fubini rule for quantifier exchange. Our main theorems are formalized in Agda.

Encode the Cake and Eat It Too: Controlling Computation in Type Theory, Locally

Proof assistants based on dependent type theory such as Agda, Lean and Rocq identify objects up to computation during proof checking. This takes away some of the proof burden from the user and even provides a way to get very efficient automation. Recently, Agda and Rocq have been extended to support user-defined computation. While they have already proven very useful, user-defined computation rules are *global*: once they are added, they are here to stay. Importing a development that makes use of those rules then means relying on them, whether we want it or not, which can lead to unwanted incompatibilities. We design LRTT, a type theory with support for *local* abstraction over user-defined computation rules. This takes the form of a prenex quantification at the definition level. This quantification is supplemented with the possibility to provide one or several instantiations that satisfy the equations definitionally. We show that a procedure inlining definitions abstracting over definitional equality is possible, in the style of monomorphisation or of C++ templates. In the process we get a conservativity result over more conventional Martin-Löf type theories. There are several benefits to such a system. First, it provides encapsulation for user-defined computation rules, which is important to avoid unwanted interactions and limits the scope in which invariants of type theory (such as termination, confluence, type preservation and consistency) are broken. Second, abstraction lets users factorise code that crucially relies on definitional equality, as well as hide implementation details that are irrelevant in some settings. Finally, it gives a way to encode certain features without paying the price of the encoding. We showcase such examples in a prototype implementation as an extension of the Rocq Prover. Additionally, all the results in this paper have been formalised in Rocq.

DafnyMPI: A Dafny Library for Verifying Message-Passing Concurrent Programs

The Message Passing Interface (MPI) is widely used in parallel, high-performance programming, yet writing bug-free software that uses MPI remains difficult. We introduce DafnyMPI, a novel, scalable approach to formally verifying MPI software. DafnyMPI allows proving deadlock freedom, termination, and functional equivalence with simpler sequential implementations. In contrast to existing specialized frameworks, DafnyMPI avoids custom concurrency logics and instead relies on Dafny, a verification-ready programming language used for sequential programs, extending it with concurrent reasoning abilities. DafnyMPI is implemented as a library that enables safe MPI programming by requiring users to specify the communication topology upfront and to verify that calls to communication primitives such as MPI_ISEND and MPI_WAIT meet their preconditions. We formalize DafnyMPI using a core calculus and prove that the preconditions suffice to guarantee deadlock freedom. Functional equivalence is proved via rely-guarantee reasoning over message payloads and a system that guarantees safe use of read and write buffers. Termination and the absence of runtime errors are proved using standard Dafny techniques. To further demonstrate the applicability of DafnyMPI, we verify numerical solutions to three canonical partial differential equations. We believe DafnyMPI demonstrates how to make formal verification viable for a broader class of programs and provides proof engineers with additional tools for software verification of parallel and concurrent systems.

An Equational Axiomatization of Dynamic Threads via Algebraic Effects: Presheaves on Finite Relations, Labelled Posets, and Parameterized Algebraic Theories

We use the theory of algebraic effects to give a complete equational axiomatization for dynamic threads. Our method is based on parameterized algebraic theories, which give a concrete syntax for strong monads on functor categories, and are a convenient framework for names and binding.

Our programs are built from the key primitives 'fork' and 'wait'. 'Fork' creates a child thread and passes its name (thread ID) to the parent thread. 'Wait' allows us to wait for given child threads to finish. We provide a parameterized algebraic theory built from fork and wait, together with basic atomic actions and laws such as associativity of 'fork'.

Our equational axiomatization is complete in two senses. First, for closed expressions, it completely captures equality of labelled posets (pomsets), an established model of concurrency: model complete. Second, any two open expressions are provably equal if they are equal under all closing substitutions: syntactically complete.

The benefit of algebraic effects is that the semantic analysis can focus on the algebraic operations of fork and wait. We then extend the analysis to a simple concurrent programming language by giving operational and denotational semantics. The denotational semantics is built using the methods of parameterized algebraic theories and we show that it is sound, adequate, and fully abstract at first order for labelled-poset observations.

A Logic for the Imprecision of Abstract Interpretations

In numerical analysis, error propagation refers to how small inaccuracies in input data or intermediate computations accumulate and affect the final result, typically governed by the stability and sensitivity of the algorithm with respect to some perturbations. The definition of a similar concept in approximated program analysis is still a challenge. In abstract interpretation, inaccuracy arises from the abstraction itself, and the propagation of this error is dictated by the abstract interpreter. In most cases, such imprecision is inevitable. In this paper we introduce a logic for deriving (upper) bounds on the inaccuracy of an abstract interpretation. We are able to derive a function that bounds the imprecision of the result of an abstract interpreter from the imprecision of its input data. When this holds we have what we call partial local completeness of the abstract interpreter, a weaker form of completeness known in the literature. To this end, we introduce the notion of a generator for a property represented in the abstract domain. Generators allow us to restrict the search space when verifying whether the bounding function holds for a given program and input. We then introduce a program logic, called Error Propagation Logic (EPL), for propagating the error bounds produced by an abstract interpretation. This logic is a combination of correctness and incorrectness logics and a logic for program ω-continuity that is also introduced in this paper.

ChopChop: A Programmable Framework for Semantically Constraining the Output of Language Models

Language models (LMs) can generate code but cannot guarantee its correctness—often producing outputs that violate type safety, program invariants, or other semantic properties. Constrained decoding offers a solution by restricting generation to only produce programs that satisfy user-defined properties. However, existing methods are either limited to syntactic constraints or rely on brittle, ad hoc encodings of semantic properties over token sequences rather than program structure.

We present ChopChop, the first programmable framework for constraining the output of LMs with respect to semantic properties. ChopChop introduces a principled way to construct constrained decoders based on analyzing the space of programs a prefix represents. It formulates this analysis as a realizability problem which is solved via coinduction, connecting token-level generation with structural reasoning over programs. We demonstrate ChopChop's generality by using it to enforce (1) equivalence to a reference program and (2) type safety. Across a range of models and tasks, ChopChop improves success rates while maintaining practical decoding latency.
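
The overall loop being described can be pictured with a toy sketch (this is not ChopChop's API; the scoring function, vocabulary, and prefix-feasibility oracle below are stand-ins for a real model and a real semantic analysis): at each decoding step, tokens whose extensions can no longer be completed into an output satisfying the property are masked out before sampling.

import random

VOCAB = ["(", ")", "a", "<eos>"]
MAX_LEN = 8

def toy_lm_scores(prefix):
    # Stand-in for a language model: fixed positive scores per token.
    return {"(": 1.0, ")": 1.0, "a": 2.0, "<eos>": 1.0}

def feasible(prefix):
    # Prefix-feasibility oracle for the toy property "the final output is a
    # balanced parenthesis string (letters allowed) of length <= MAX_LEN".
    depth = 0
    for t in prefix:
        if t == "(":
            depth += 1
        elif t == ")":
            depth -= 1
            if depth < 0:
                return False
    # All open parentheses must still be closable within the length budget.
    return depth <= MAX_LEN - len(prefix)

def satisfied(prefix):
    return feasible(prefix) and sum(t == "(" for t in prefix) == sum(t == ")" for t in prefix)

def constrained_decode(rng):
    prefix = []
    while len(prefix) < MAX_LEN:
        scores = toy_lm_scores(prefix)
        allowed = []
        for tok in VOCAB:
            if tok == "<eos>":
                ok = satisfied(prefix)          # may stop only if the property holds now
            else:
                ok = feasible(prefix + [tok])   # may continue only if still completable
            if ok:
                allowed.append(tok)
        # Sample among the allowed tokens, proportionally to the model scores.
        total = sum(scores[t] for t in allowed)
        r, acc, choice = rng.random() * total, 0.0, allowed[-1]
        for tok in allowed:
            acc += scores[tok]
            if r <= acc:
                choice = tok
                break
        if choice == "<eos>":
            break
        prefix.append(choice)
    return "".join(prefix)

print(constrained_decode(random.Random(0)))

The interesting part in the paper is precisely what this sketch hand-waves: deciding feasibility of a prefix with respect to a semantic property such as type safety, which ChopChop formulates as a realizability problem solved via coinduction.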

Piecewise Analysis of Probabilistic Programs via 𝑘-Induction

In probabilistic program analysis, quantitative analysis aims at deriving tight numerical bounds for probabilistic properties such as expectation and assertion probability. Most previous works consider numerical bounds over the whole program state space monolithically and do not consider piecewise bounds. Not surprisingly, monolithic bounds are either conservative, or not expressive and succinct enough in general. To derive better bounds, we propose a novel approach for synthesizing piecewise bounds over probabilistic programs. First, we show how to extract useful piecewise information from latticed k-induction operators, and combine the piecewise information with the Optional Stopping Theorem to obtain a general approach to derive piecewise bounds over probabilistic programs. Second, we develop algorithms to synthesize piecewise polynomial bounds, and show that the synthesis can be reduced to bilinear programming in the linear case, and soundly relaxed to semidefinite programming in the polynomial case. Experimental results show that our approach generates tight piecewise bounds for a wide range of benchmarks when compared with the state of the art.

Formal Verification for JavaScript Regular Expressions: A Proven Mechanized Semantics and Its Applications

We present the first mechanized, succinct, practical, complete, and proven-faithful semantics for a modern regular expression language with backtracking semantics. We ensure its faithfulness by proving it equivalent to a preexisting line-by-line embedding of the official ECMAScript specification of JavaScript regular expressions. We demonstrate its practicality by presenting two real-world applications. First, a new notion of contextual equivalence for modern regular expressions, which we use to prove or disprove rewrites drawn from previous work. Second, the first formal proof of the PikeVM algorithm used in many real-world engines. In contrast with the specification and other formalization work, our semantics captures not only the top-priority match, but a full backtracking tree recording all possible matches and their respective priority. All our definitions and results have been mechanized in the Rocq proof assistant.

Lazy Linearity for a Core Functional Language

Traditionally, in linearly typed languages, consuming a linear resource is synonymous with its syntactic occurrence in the program. However, under the lens of non-strict evaluation, linearity can be further understood semantically, where a syntactic occurrence of a resource does not necessarily entail using that resource when the program is executed. While this distinction has been largely unexplored, it turns out to be inescapable in Haskell's optimising compiler, which heavily rewrites the source program in ways that break syntactic linearity but preserve the program's semantics. We introduce Linear Core, a novel system that statically accounts for the lazy semantics of linearity and is suitable for lazy languages such as the Core intermediate language of the Glasgow Haskell Compiler. We prove that Linear Core is sound, guaranteeing linear resource usage, and that multiple optimising transformations preserve linearity in Linear Core while failing to do so in Core. We have implemented Linear Core as a compiler plugin to validate the system against linearity-heavy libraries, including linear-base.

Parameterized Verification of Quantum Circuits

We present the first fully automatic framework for verifying relational properties of parameterized quantum programs, i.e., programs that, given an input size, generate a corresponding quantum circuit. We focus on verifying input-output correctness as well as equivalence. At the core of our approach is a new automata model, synchronized weighted tree automata (SWTAs), which compactly and precisely captures the infinite families of quantum states produced by parameterized programs. We introduce a class of transducers to model quantum gate semantics and develop composition algorithms for constructing transducers of parameterized circuits. Verification is reduced to functional inclusion or equivalence checking between SWTAs, for which we provide decision procedures. Our implementation demonstrates both the expressiveness and practical efficiency of the framework by verifying a diverse set of representative parameterized quantum programs with verification times ranging from milliseconds to seconds.

A Verified High-Performance Composable Object Library for Remote Direct Memory Access

Remote Direct Memory Access (RDMA) is a memory technology that allows remote devices to directly write to and read from each other's memory, bypassing components such as the CPU and operating system. This enables low-latency, high-throughput networking, as required for many modern data centres, HPC applications and AI/ML workloads. However, baseline RDMA exposes a highly permissive weak memory model that is difficult to use in practice and has only recently been formalised.

In this paper, we introduce the Library of Composable Objects (LOCO), a formally verified library for building multi-node objects on RDMA, filling the gap between shared memory and distributed system programming. LOCO objects are well-encapsulated and take advantage of the strong locality and the weak consistency characteristics of RDMA. They have performance comparable to custom RDMA systems (e.g. distributed maps), but with a far simpler programming model amenable to formal proofs of correctness.

To support verification, we develop a novel modular declarative verification framework, called Mowgli, that is flexible enough to model multi-node objects and is independent of the underlying memory consistency model. We instantiate Mowgli with the RDMA memory model, and use it to verify the correctness of LOCO libraries.

A Family of Sims with Diverging Interests

Simulations are widely-used notions of program refinement. This paper discusses and compares several of them, in particular notions of simulation that are both weak and sensitive to divergence. Complex simulation proofs performed in proof assistants, for instance in a verified compilation setting, often rely on variants of normed simulation, which is not complete with respect to divergence-sensitive weak simulation. We propose to bridge this gap with µdiv-simulation, a novel notion of simulation that is equivalent to classical divergence-sensitive weak simulation, and designed to be as comfortable to use as modern characterizations of normed simulation. We then define a parameterized notion of simulation that covers strong simulation, weak simulation, µdiv-simulation, and 9 more notions of simulation, and jointly establish various "up-to" reasoning techniques for these 12 notions. Our results are formalized in Rocq and instantiated on two case studies: Choice Trees and a CompCert pass. Verified compilation is a major motivation for our study, but because we work with an abstract LTS setting, our results are also relevant to other fields that make use of divergence-sensitive weak simulation, such as model checking.

Classical Notions of Computation and the Hasegawa-Thielecke Theorem

In the spirit of the Curry-Howard correspondence between proofs and programs, we define and study a syntax and semantics for classical logic equipped with a computationally involutive negation, using a polarised effect calculus, the linear classical L-calculus. A main challenge in designing a denotational semantics for the calculus is to accommodate both call-by-value and call-by-name evaluation strategies, which leads to a failure of associativity of composition. In order to tackle this issue, we define a notion of adjunction between graph morphisms on non-associative categories, which we use to formulate polarized and non-associative notions of symmetric monoidal closed duploid and of dialogue duploid. We show that they provide a direct style counterpart to adjunction models: linear effect adjunctions for the (linear) call-by-push-value calculus and dialogue chiralities for linear continuations, respectively. In particular, we show that the syntax of the linear classical L-calculus can be interpreted in any dialogue duploid, and that it defines in fact a syntactic dialogue duploid. As an application, we establish, by semantic as well as syntactic means, the Hasegawa-Thielecke theorem, which states that the notions of central map and of thunkable map coincide in any dialogue duploid (in particular, for any double negation monad on a symmetric monoidal category).

Bounded Treewidth, Multiple Context-Free Grammars, and Downward Closures

The reachability problem in multi-pushdown automata (MPDA), or equivalently, interleaved Dyck reachability, has many applications in static analysis of recursive programs. An example is safety verification of multi-threaded recursive programs with shared memory. Since these problems are undecidable, the literature contains many decidable (and efficient) underapproximations of MPDA.

A uniform framework that captures many of these underapproximations is that of bounded treewidth: To each execution of the MPDA, we associate a graph; then we consider the subset of all graphs that have a treewidth at most k, for some constant k. In fact, bounding treewidth is a generic approach to obtain classes of systems with decidable reachability, even beyond MPDA underapproximations. The resulting systems are also called MSO-definable bounded-treewidth systems.

While bounded treewidth is a powerful tool for reachability and similar types of analysis, the word languages (i.e. action sequences corresponding to executions) of these systems remain far from understood. For the slight restriction of bounded special treewidth, or “bounded-stw” (which is equivalent to bounded treewidth on MPDA, and even includes all bounded-treewidth systems studied in the literature), this work reveals a connection with multiple context-free languages (MCFL), a concept from computational linguistics. We show that the word languages of MSO-definable bounded-stw systems are exactly the MCFL.

We exploit this connection to provide an optimal algorithm for computing downward closures for MSO-definable bounded-stw systems. Computing downward closures is a notoriously difficult task that has many applications in the verification of complex systems: As an example application, we show that in programs with dynamic spawning of MSO-definable bounded-stw processes, safety verification has the same complexity as in the case of dynamically spawned sequential recursive processes.

Oriented Metrics for Bottom-Up Enumerative Synthesis

In syntax-guided synthesis, one of the challenges is to reduce the enormous size of the search space. We observe that most search spaces are not just flat sets of programs, but can be endowed with a structure that we call an oriented metric. Oriented metrics measure the distance between programs, like ordinary metrics do, but are designed for settings in which operations have an orientation. Our focus is on the string and the bitvector domains, where operations like concatenation and bitwise conjunction transform an input into an output in a way that is not symmetric. We develop several new oriented metrics for these domains.

Oriented metrics are designed for search space reduction, and we present four techniques: (i) pruning the search space to a ball around the ground truth, (ii) factorizing the search space by an equivalence that is induced by the oriented metric, (iii) abstracting the oriented metric (and hence the equivalence) and refining it, and (iv) improving the enumeration order by learning from abstract information. We acknowledge that these techniques are inspired by developments in the literature. By understanding their roots in oriented metrics, we can substantially increase their applicability and efficiency. We have integrated these techniques into a new synthesis algorithm and implemented the algorithm in a new solver. Notably, our solver is generic in the oriented metric over which it computes. We conducted experiments in the string and the bitvector domains, and consistently improve performance over the state of the art by more than an order of magnitude.

Big-Stop Semantics: Small-Step Semantics in a Big-Step Judgment

As is evident in the programming language literature, many practitioners favor specifying dynamic program behavior using big-step over small-step semantics. Unlike small-step semantics, which must dwell on every intermediate program state, big-step semantics conveniently jumps directly to the ever-important result of the computation. Big-step semantics also typically involves fewer inference rules than its small-step counterpart. However, in exchange for ergonomics, big-step semantics gives up power: Small-step semantics describes program behaviors that are outside the grasp of big-step semantics, notably divergence.

This work presents a little-known extension of big-step semantics with inductive definitions that captures diverging computations without introducing error states. This big-stop semantics is illustrated for typed, untyped, and effectful variants of PCF. Big-stop semantics extends the standard big-step inference rules with a few additional rules to define an evaluation judgment that is equivalent to the reflexive-transitive closure of small-step transitions. This simple extension contrasts with other solutions in the literature that sacrifice ergonomics by introducing many additional inference rules, global state, and/or less-commonly-understood reasoning principles like coinduction. The ergonomics of big-stop semantics is exemplified via concise Agda proofs for some key results and compilation theorems.
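
Concretely, the relation that the big-stop judgment is proved equivalent to is the reflexive-transitive closure of the small-step relation, given inductively by the two standard rules

$$\frac{}{\,e \longrightarrow^{*} e\,} \qquad\qquad \frac{e \longrightarrow e' \qquad e' \longrightarrow^{*} e''}{e \longrightarrow^{*} e''}$$

(standard notation, not the big-stop rules themselves); the point is that a big-step-style judgment extended with only a few rules can capture exactly this relation, so diverging programs are covered without error states or coinduction.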

Foundational Multi-Modal Program Verifiers

Multi-modal program verification is a process of validating code against its specification using both dynamic and symbolic techniques, and proving its correctness by a combination of automated and interactive machine-assisted tools. In order to be trustworthy, such verification tools must themselves come with formal soundness proofs, establishing that any program verified in them against a certain specification does not violate the specification's statement when executed. Verification tools that are proven sound in a general-purpose proof assistant with a small trusted core are commonly referred to as foundational.

We present a framework that facilitates and streamlines the construction of program verifiers that are both foundational and multi-modal. Our approach adopts the well-known idea of monadic shallow embedding of an executable program semantics into the programming language of a theorem prover based on higher-order logic, in our case, the Lean proof assistant. We provide a library of monad transformers for such semantics, encoding a variety of computational effects, including state, divergence, exceptions, and non-determinism. The key theoretical innovation of our work is the notion of monad transformer algebras, which enables automated derivation of the respective sound verification condition generators. We show that proofs of the resulting verification conditions enjoy automation using off-the-shelf SMT solvers and allow for an interactive proof mode when automation fails. To demonstrate the versatility of our framework, we instantiated it to embed two foundational multi-modal verifiers into Lean for reasoning about (1) distributed protocol safety and (2) Dafny-style specifications of imperative programs, and used them to mechanically verify a number of non-trivial case studies.

Generating Compilers for Qubit Mapping and Routing

To evaluate a quantum circuit on a quantum processor, one must find a mapping from circuit qubits to processor qubits and plan the instruction execution while satisfying the processor's constraints. This is known as the qubit mapping and routing (QMR) problem. High-quality QMR solutions are key to maximizing the utility of scarce quantum resources and minimizing the probability of logical errors affecting computation. The challenge is that the landscape of quantum processors is incredibly diverse and fast-evolving. Given this diversity, dozens of papers have addressed the QMR problem for different qubit hardware, connectivity constraints, and quantum error correction schemes by developing a new algorithm for a particular context. We present an alternative approach: automatically generating qubit mapping and routing compilers for arbitrary quantum processors. Though each QMR problem is different, we identify a common core structure—the device state machine—that we use to formulate an abstract QMR problem. Our formulation naturally leads to a compact domain-specific language for specifying QMR problems and a powerful parametric algorithm that can be instantiated for any QMR specification. Our thorough evaluation on case studies of important QMR problems shows that generated compilers are competitive with handwritten, specialized compilers in terms of runtime and solution quality.

Welterweight Go: Boxing, Structural Subtyping, and Generics

Go’s unique combination of structural subtyping between generics and types with non-uniform runtime representations presents significant challenges for formalising the language. We introduce WG (Welterweight Go), a core model of Go that captures key features excluded by prior work, including underlying types, type constraints and type sets, and proposed new features, such as generic methods. We also develop LWG, a lower-level language that models Go’s runtime mechanisms, notably the distinction between raw struct values and interface values that carry runtime type information (RTTI). We give a type-directed compilation from WG to LWG that demonstrates how the proposed features can be implemented while observing important design and implementation goals for Go: compatibility with separate compilation, and no runtime code generation. Unlike existing approaches based on static monomorphisation, our compilation strategy uses runtime type conversions and adaptor methods to handle the complex interactions between structural subtyping, generics, and Go’s runtime infrastructure.

Nice to Meet You: Synthesizing Practical MLIR Abstract Transformers

Static analyses play a fundamental role during compilation: they discover facts that are true in all executions of the code being compiled, and then these facts are used to justify optimizations and diagnostics. Each static analysis is based on a collection of abstract transformers that provide abstract semantics for the concrete instructions that make up a program. It can be challenging to implement abstract transformers that are sound, precise, and efficient—and in fact both LLVM and GCC have suffered from miscompilations caused by unsound abstract transformers. Moreover, even after more than 20 years of development, LLVM lacks abstract transformers for hundreds of instructions in its intermediate representation (IR). We developed NiceToMeetYou: a program synthesis framework for abstract transformers aimed at the kinds of non-relational integer abstract domains that are heavily used by today’s production compilers. It exploits a simple but novel technique for breaking the synthesis problem into parts: each of our transformers is the meet of a collection of simpler, sound transformers that are synthesized such that each new piece fills a gap in the precision of the final transformer. Our design point is bulk automation: no sketches are required. Transformers are verified by lowering to a previously-created SMT dialect of MLIR. Each of our synthesized transformers is provably sound and some (17%) are more precise than those provided by LLVM.
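
The decomposition idea is easy to see on a toy instance (a hand-written interval-domain example, not one of the tool's synthesized MLIR transformers): an abstract transformer for unsigned bitwise AND can be assembled as the meet of two individually sound but imprecise transformers, and its soundness can be checked exhaustively at a small bit width.

from itertools import product

WIDTH = 4                       # small bit width so exhaustive checking is cheap
MAXV = (1 << WIDTH) - 1

# An interval (lo, hi) abstracts the set of values {v | lo <= v <= hi}.

def meet(a, b):
    # Greatest lower bound of two intervals (their intersection).
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None   # None = bottom (empty set)

def t1(x, y):
    # Simple sound transformer #1: ignore y. For unsigned values, 0 <= a & b <= a.
    return (0, x[1])

def t2(x, y):
    # Simple sound transformer #2: ignore x.
    return (0, y[1])

def t_and(x, y):
    # The composite transformer is the meet of the simple pieces.
    return meet(t1(x, y), t2(x, y))

def sound(transformer):
    # Exhaustive soundness check: for every pair of input intervals and every
    # pair of concrete values drawn from them, the concrete result must lie
    # inside the abstract result.
    intervals = [(lo, hi) for lo in range(MAXV + 1) for hi in range(lo, MAXV + 1)]
    for x, y in product(intervals, repeat=2):
        out = transformer(x, y)
        for a in range(x[0], x[1] + 1):
            for b in range(y[0], y[1] + 1):
                r = a & b
                if out is None or not (out[0] <= r <= out[1]):
                    return False
    return True

print(sound(t_and))   # expected: True

The meet of sound transformers is always sound and at least as precise as each piece, which is why adding a newly synthesized piece can only close precision gaps.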

Counting and Sampling Traces in Regular Languages

In this work, we study the fundamental problems of counting and sampling traces that a regular language touches. Formally, one fixes the alphabet Σ and an independence relation I ⊆ Σ × Σ. The computational problems we address take as input a regular language L over Σ, presented as a finite automaton with m states, together with a natural number n (presented in unary). For the counting problem, the output is the number of Mazurkiewicz traces (induced by I) that intersect the nth slice Ln = L ∩ Σn of L, i.e., traces that have at least one linearization in Ln. For the sampling problem, the output is a trace drawn from a distribution that is approximately uniform over all such traces. These problems are motivated by applications such as bounded model checking based on partial-order reduction, where an a priori estimate of the size of the state space can significantly improve usability, as well as testing approaches for concurrent programs that use partial-order-aware random sampling, where uniform exploration is desirable for effective bug detection.

We first show that the counting problem is #P-hard even when the automaton accepting the language L is deterministic, which is in sharp contrast to the corresponding problem for counting the words of a DFA, which is solvable in polynomial time. We then show that the counting problem remains in the class #P for both NFAs and DFAs, independent of whether L is trace-closed. Finally, our main contributions are a fully polynomial-time randomized approximation scheme (FPRAS) that, with high probability, estimates the desired count within a specified accuracy parameter, and a fully polynomial-time almost uniform sampler (FPAUS) that generates traces while ensuring that the distribution induced on them is approximately uniform with high probability.
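
For contrast with the #P-hardness result, the word-counting baseline mentioned above is a simple dynamic program, shown here for a toy DFA (the automaton, and the extension to uniform sampling of words, are illustrative only and unrelated to the paper's FPRAS and FPAUS for traces):

import random

# A DFA over alphabet {'a','b'}: states 0..2, start state 0, accepting {0}.
# DELTA[q][c] gives the successor state (toy example: reject words containing "aa",
# and require the word to end in state 0).
ALPHABET = "ab"
DELTA = {0: {"a": 1, "b": 0}, 1: {"a": 2, "b": 0}, 2: {"a": 2, "b": 2}}
START, ACCEPTING = 0, {0}

def count_table(n):
    # N[k][q] = number of words of length k leading from state q to acceptance.
    N = [{q: 0 for q in DELTA} for _ in range(n + 1)]
    for q in DELTA:
        N[0][q] = 1 if q in ACCEPTING else 0
    for k in range(1, n + 1):
        for q in DELTA:
            N[k][q] = sum(N[k - 1][DELTA[q][c]] for c in ALPHABET)
    return N

def count_words(n):
    return count_table(n)[n][START]

def sample_word(n, rng):
    # Uniform sampling over accepted words of length n: walk the DFA, choosing
    # each letter with probability proportional to its number of accepting completions.
    N = count_table(n)
    assert N[n][START] > 0, "no accepted word of this length"
    q, word = START, []
    for k in range(n, 0, -1):
        weights = [N[k - 1][DELTA[q][c]] for c in ALPHABET]
        r, acc = rng.randrange(sum(weights)), 0
        for c, w in zip(ALPHABET, weights):
            acc += w
            if r < acc:
                word.append(c)
                q = DELTA[q][c]
                break
    return "".join(word)

print(count_words(6), sample_word(6, random.Random(1)))

Counting traces is harder precisely because many distinct words of length n may be linearizations of the same trace, so per-word counts like these can no longer simply be summed.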

A Complementary Approach to Incorrectness Typing

We introduce a new two-sided type system for verifying the correctness and incorrectness of functional programs with atoms and pattern matching. A key idea in the work is that types should range over sets of normal forms, rather than sets of values, and this allows us to define a complement operator on types that acts as a negation on typing formulas. We show that the complement allows us to derive a wide range of refutation principles within the system, including the type-theoretic analogue of co-implication, and we use them to certify that a number of Erlang-like programs go wrong. An expressive axiomatisation of the complement operator via subtyping is shown decidable, and the type system as a whole is shown to be not only sound, but also complete for normal forms.

From Semantics to Syntax: A Type Theory for Comprehension Categories

Recent models of intensional type theory have been constructed in algebraic weak factorization systems (AWFSs). AWFSs give rise to comprehension categories that feature non-trivial morphisms between types; these morphisms are not used in the standard interpretation of Martin-Löf type theory in comprehension categories.

We develop a type theory that internalizes morphisms between types, reflecting this semantic feature back into syntax. Our type theory comes with Π-, Σ-, and identity types. We discuss how it can be viewed as an extension of Martin-Löf type theory with coercive subtyping, as sketched by Coraglia and Emmenegger. We furthermore define semantic structure that interprets our type theory and prove a soundness result. Finally, we exhibit many examples of the semantic structure, yielding a plethora of interpretations.

Parameterized Infinite-State Reactive Synthesis

We propose a method to synthesize a parameterized infinite-state system that can be instantiated for different parameter values. The specification is given in a parameterized temporal logic that allows for data variables as well as parameters that encode properties of the environment. Our synthesis method runs in a counterexample-guided loop consisting of four steps: (1) We synthesize concrete systems for some small parameter instantiations using existing techniques. (2) We generalize the concrete systems into a parameterized program. (3) We create a proof candidate consisting of an invariant and a ranking function. (4) We check the proof candidate for consistency with the program. If the proof succeeds, the parameterized program is valid. Otherwise, we identify a parameter value for which it fails and add a new concrete instance to step one. To generalize programs and create proof candidates, we use a combination of anti-unification and syntax-guided synthesis to express the differences between the programs as a function of the parameters. We evaluate our approach on new examples and examples from the literature that are manually parameterized.

What Is a Monoid?

In many situations one encounters an entity that resembles a monoid. It consists of a carrier and two operations that resemble a unit and a multiplication, subject to three equations that resemble associativity and left and right unital laws. The question then arises whether this entity is, in fact, a monoid in a suitable sense.

Category theorists have answered this question by providing a notion of monoid in a monoidal category, or more generally in a multicategory. While these encompass many examples, there remain cases which do not fit into these frameworks, such as the notion of relative monad and the modelling of call-by-push-value sequencing. In each of these examples, the leftmost and/or the rightmost factor of a multiplication or associativity law seems to be distinguished.

To include such examples, we generalize the multicategorical framework in two stages.

Firstly, we move to the framework of a left-skew multicategory (due to Bourke and Lack), which generalizes both multicategory and left-skew monoidal category. The notion of monoid in this framework encompasses examples where only the leftmost factor is distinguished, such as the notion of relative monad.

Secondly, we consider monoids in the novel framework of a bi-skew multicategory. This encompasses examples where both the leftmost and the rightmost factor are distinguished, such as the notion of a category on a span, and the modelling of call-by-push-value sequencing.

In the bi-skew framework (which is the most general), we give a coherence result saying that a monoid corresponds to an unbiased monoid, i.e. a map from the terminal bi-skew multicategory.
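
For reference, the ordinary notion being generalized can be spelled out in a few lines of Python: a carrier, a unit, a multiplication, and the three laws, checked by brute force over a finite carrier. This sketch only illustrates the classical starting point, not the (bi-)skew multicategorical framework; all names are hypothetical.

    # A plain monoid: a carrier with a unit and a multiplication satisfying
    # associativity and the left/right unit laws, checked here by brute force
    # over a finite carrier.

    from itertools import product

    class Monoid:
        def __init__(self, carrier, unit, mult):
            self.carrier, self.unit, self.mult = carrier, unit, mult

        def satisfies_laws(self):
            m, e = self.mult, self.unit
            units = all(m(e, x) == x and m(x, e) == x for x in self.carrier)
            assoc = all(m(m(x, y), z) == m(x, m(y, z))
                        for x, y, z in product(self.carrier, repeat=3))
            return units and assoc

    # Example: addition modulo 4.
    z4 = Monoid(carrier=range(4), unit=0, mult=lambda x, y: (x + y) % 4)
    print(z4.satisfies_laws())   # True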

Stateful Differential Operators for Incremental Computing

Differential operators map input changes to output changes and form the building blocks of efficient incremental computations. For example, differential operators for relational algebra are used to perform live view maintenance in database systems. However, few differential operators are known and it is unclear how to develop and verify new efficient operators. In particular, we found that differential operators often need to use internal state to selectively cache relevant information, which is not supported by prior work. To this end, we designed a specification for stateful differential operators that allows custom state, yet places sufficient constraints to ensure correctness. We model our specification in Rocq and show that the specification not only guides the design of novel differential operators, but also can capture some of the most sophisticated existing differential operators: database join and Datalog aggregation. We show how to describe complex incremental computations in OCaml by composing stateful differential operators, which we have extracted from Rocq.
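
A minimal Python sketch of a stateful differential operator, assuming a DISTINCT view over multisets (this is an illustration of the idea of operator-local state, not the paper's Rocq specification or its OCaml extraction):

    # A stateful differential operator for DISTINCT over multisets. The cached
    # multiplicities are the operator's internal state; each input change (a map
    # from element to count delta) is translated into an output change on the
    # distinct view.

    class IncrementalDistinct:
        def __init__(self):
            self.counts = {}                      # internal state: multiplicities

        def apply_delta(self, delta):
            out = {}
            for x, d in delta.items():
                old = self.counts.get(x, 0)
                new = old + d
                self.counts[x] = new
                if new > 0 and old <= 0:
                    out[x] = +1                   # element enters the view
                elif old > 0 and new <= 0:
                    out[x] = -1                   # element leaves the view
            return out

    op = IncrementalDistinct()
    print(op.apply_delta({"a": 1, "b": 1}))       # {'a': 1, 'b': 1}
    print(op.apply_delta({"a": 1}))               # {} -- duplicate insert, no change
    print(op.apply_delta({"a": -2}))              # {'a': -1}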

Probabilistic Programming with Vectorized Programmable Inference

We present GenJAX, a new language and compiler for vectorized programmable probabilistic inference. GenJAX integrates the vectorizing map (vmap) operation from array programming frameworks such as JAX into the programmable inference paradigm, enabling compositional vectorization of features such as probabilistic program traces, stochastic branching (for expressing mixture models), and programmable inference interfaces for writing custom probabilistic inference algorithms. We formalize vectorization as a source-to-source program transformation on a core calculus for probabilistic programming (λGEN), and prove that it correctly vectorizes both modeling and inference operations. We have implemented our approach in the GenJAX language and compiler, and have empirically evaluated this implementation on several benchmarks and case studies. Our results show that our implementation supports a wide and expressive set of programmable inference patterns and delivers performance comparable to hand-optimized JAX code.
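
The following sketch uses plain JAX rather than the GenJAX API to convey the underlying vectorization move; the model and all names are hypothetical.

    # Plain JAX (not the GenJAX API): jax.vmap turns a per-particle model score
    # into a batched one, so many proposed traces are weighted in a single
    # vectorized call.

    import jax
    import jax.numpy as jnp
    from jax.scipy.stats import norm

    def log_joint(mu, obs):
        # mu ~ Normal(0, 1); each observation ~ Normal(mu, 1)
        return norm.logpdf(mu, 0.0, 1.0) + jnp.sum(norm.logpdf(obs, mu, 1.0))

    obs = jnp.array([1.2, 0.8, 1.1])
    particles = jnp.linspace(-2.0, 2.0, 1000)      # proposed values of mu

    # Vectorize over the particle axis only; obs is shared across the batch.
    log_weights = jax.vmap(log_joint, in_axes=(0, None))(particles, obs)
    approx_post_mean = jnp.sum(jax.nn.softmax(log_weights) * particles)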

Cryptis: Cryptographic Reasoning in Separation Logic

We introduce Cryptis, an extension of the Iris separation logic for the symbolic model of cryptography. The combination of separation logic and cryptographic reasoning allows us to prove the correctness of a protocol and later reuse this result to verify larger systems that rely on the protocol. To make this integration possible, we propose novel specifications for authentication protocols that allow agents in a network to agree on the use of system resources. We evaluate our approach by verifying various authentication protocols and a key-value store server that uses these authentication protocols to connect to clients. Our results are formalized in Rocq.

Quantum Circuits Are Just a Phase

Quantum programs today are written at a low level of abstraction - quantum circuits akin to assembly languages - and the unitary parts of even advanced quantum programming languages essentially function as circuit description languages. This state of affairs impedes scalability, clarity, and support for higher-level reasoning. More abstract and expressive quantum programming constructs are needed.

To this end, we introduce a simple syntax for generating unitaries from "just a phase"; we combine a (global) phase operation that captures phase shifts with a quantum analogue of the "if let" construct that captures subspace selection via pattern matching. This minimal language lifts the focus from gates to eigendecomposition, conjugation, and controlled unitaries, which are common building blocks in quantum algorithm design.

We demonstrate the expressive power of our language in several ways. Firstly, we establish that our representation is universal by deriving a universal quantum gate set. Secondly, we show that important quantum algorithms can be expressed naturally and concisely, including Grover's search algorithm, Hamiltonian simulation, Quantum Fourier Transform, Quantum Signal Processing, and the Quantum Eigenvalue Transformation. Furthermore, we give clean denotational semantics grounded in categorical quantum mechanics. Finally, we implement a prototype compiler that efficiently translates terms of our language to quantum circuits, and prove that it is sound with respect to these semantics. Collectively, these contributions show that this construct offers a principled and practical step toward more abstract and structured quantum programming.
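
As a small illustration of the phase-plus-subspace-selection idea, in plain numpy rather than the paper's language: applying a phase only on a selected subspace already yields familiar gates.

    # Applying a phase only on a selected subspace yields familiar gates: the
    # |1> subspace of a qubit with phase pi gives Pauli-Z, the |11> subspace of
    # two qubits gives controlled-Z, and phase pi/2 gives the S gate.

    import numpy as np

    def phase_on_subspace(dim, indices, theta):
        # Multiply the selected basis states by e^{i*theta}; leave the rest fixed.
        u = np.eye(dim, dtype=complex)
        for k in indices:
            u[k, k] = np.exp(1j * theta)
        return u

    Z  = phase_on_subspace(2, [1], np.pi)          # diag(1, -1)
    CZ = phase_on_subspace(4, [3], np.pi)          # diag(1, 1, 1, -1)
    S  = phase_on_subspace(2, [1], np.pi / 2)      # diag(1, i)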

Bounded Sort Polymorphism with Elimination Constraints

Proof assistants based on dependent type theory---such as Agda, Lean, and Rocq---employ different universes to classify types, typically combining a predicative tower for computationally relevant types with a possibly impredicative universe for proof-irrelevant propositions. Several other universes with specific logical and computational principles have been explored in the literature. In general, a universe is characterized by its sort (e.g., Type, Prop, or SProp) and, in the predicative case, by its level. To improve modularity and better avoid code duplication, sort polymorphism has recently been introduced and integrated in the Rocq prover.

However, we observe that, due to its unbounded formulation, sort polymorphism is currently insufficiently expressive to abstract over valid definitions with a single polymorphic schema. Indeed, to ensure soundness of a multi-sorted type theory, the interaction between different sorts must be carefully controlled, as exemplified by the forbidden elimination of irrelevant terms to produce relevant ones. As a result, generic functions that eliminate values of inductive types from one sort to another cannot be made polymorphic; dually, polymorphic records that encapsulate attributes of different sorts cannot be defined. This lack of expressiveness also precludes the inference of principal types, which is highly desirable for both metatheoretical and practical reasons. To address these issues, we extend sort polymorphism with bounds that reflect the required elimination constraints on sort variables. We present the metatheory of bounded sort polymorphism, paying particular attention to the consistency of the resulting constraint graph. We implement bounded sort polymorphism in Rocq and illustrate its benefits through concrete examples. Bounded sort polymorphism with elimination constraints is a natural and general solution that effectively addresses current limitations and fosters the development of, and practical experimentation with, multi-sorted type theories.

Coco: Corecursion with Compositional Heterogeneous Productivity

Contemporary proof assistants impose restrictive syntactic guardedness conditions that reject many valid corecursive definitions. Existing approaches to overcome these restrictions present a fundamental trade-off between coverage and automation.

We present Compositional Heterogeneous Productivity (CHP), a theoretical framework that unifies high automation with extensive coverage for corecursive definitions. CHP introduces heterogeneous productivity applicable to functions with diverse domain and codomain types, including non-coinductive types. Its key innovation is compositionality: the productivity of composite functions is systematically computed from their components, enabling modular reasoning about complex corecursive patterns.

Building on CHP, we develop Coco, a corecursion library for Rocq that provides extensive automation for productivity computation and fixed-point generation.
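
To give a flavour of the definitions that syntactic guardedness conditions reject, here is a Python sketch of a lazy stream defined by mapping under the constructor (the paper itself targets Rocq; all names here are hypothetical):

    # A lazy stream whose corecursive definition maps under the constructor:
    # nats = 0 :: smap (+1) nats. The definition is productive, but the
    # corecursive call sits under smap rather than directly under a
    # constructor, a pattern that syntactic guardedness checks typically reject.

    class Stream:
        def __init__(self, head, tail_thunk):
            self.head = head
            self._tail = tail_thunk                # delayed rest of the stream

        def tail(self):
            return self._tail()

    def smap(f, s):
        return Stream(f(s.head), lambda: smap(f, s.tail()))

    nats = Stream(0, lambda: smap(lambda n: n + 1, nats))

    def take(n, s):
        out = []
        for _ in range(n):
            out.append(s.head)
            s = s.tail()
        return out

    print(take(5, nats))                           # [0, 1, 2, 3, 4]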

SECTION: Corrections

Corrigendum: Unrealizability Logic

This is a corrigendum for the article "Unrealizability Logic" by Jinwoo Kim, Loris D’Antoni, and Thomas Reps, published in Proc. ACM Program. Lang. 7, POPL, Article 23 (January 2023), https://doi.org/10.1145/3571216. The authors, with the help of Shaan Nagy, discovered an implicit assumption in the statement of soundness and a flaw in the completeness proof of the original paper. This corrigendum clarifies the assumption made in the soundness statement and rectifies the completeness proof.