Continuations are a useful concept to master, but their support in programming languages is not necessarily accessible to beginners. We aim to improve the learning experience of continuations, in particular delimited continuations, by finding beginner-friendly designs of continuation support. For this purpose, we develop a pedagogical language, Pyret-cont, and teach a graduate course focused on continuations. Our experience provides insights for developers and educators who wish to support and teach continuations.
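To make the concept concrete, here is a minimal sketch in Python (not Pyret-cont, whose syntax is not shown here) of a continuation reified explicitly: each function receives the rest of the computation as a function argument, in continuation-passing style.

```python
# Illustrative only: the "rest of the computation" reified as a function (a
# continuation), written in continuation-passing style. This is not Pyret-cont
# code; it only shows the underlying idea such a language aims to make accessible.

def add_cps(x, y, k):
    # Instead of returning x + y, pass it to the continuation k.
    return k(x + y)

def square_cps(x, k):
    return k(x * x)

def pythagoras_cps(a, b, k):
    # Compute a*a + b*b, threading each intermediate result through
    # an explicit continuation.
    return square_cps(a, lambda a2:
           square_cps(b, lambda b2:
           add_cps(a2, b2, k)))

print(pythagoras_cps(3, 4, lambda result: result))  # 25
```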
In Formal Languages and Automata Theory courses, students are exposed to context-free grammars. They are expected to learn how to develop grammars and how to derive words. Despite multiple classroom and textbook examples, some students find grammar design and word derivation difficult due to nondeterminism. A modern pedagogy uses a programming-based approach to introduce students to context-free grammars. Using this methodology, students can design, implement, validate, and verify context-free grammars. However, they find the design and development task challenging. This article presents a novel dynamic visualization tool to help students with grammar design, implementation, and verification. The tool shows the user the stepwise construction of a derivation tree and provides support for visualizing whether nonterminal invariants hold. Empirical data from a small formative study suggests that students find the visualization tool useful for understanding word derivation, debugging grammars, and developing correctness proofs.
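As a rough illustration of the kind of stepwise derivation such a tool visualizes, the following Python sketch encodes a small context-free grammar and prints one leftmost derivation step by step; the grammar, encoding, and function names are illustrative choices, not the tool's own notation.

```python
# A minimal sketch: a context-free grammar for balanced parentheses and one
# leftmost derivation, printed step by step the way a derivation tree is built.

GRAMMAR = {
    "S": [["(", "S", ")", "S"], []],   # S -> ( S ) S | epsilon
}

def derive(sentential_form, choices):
    """Expand the leftmost nonterminal using the given rule choices."""
    steps = [sentential_form]
    for choice in choices:
        i = next(i for i, sym in enumerate(sentential_form) if sym in GRAMMAR)
        sentential_form = (sentential_form[:i]
                           + GRAMMAR[sentential_form[i]][choice]
                           + sentential_form[i + 1:])
        steps.append(sentential_form)
    return steps

# Derive "()":  S => ( S ) S => ( ) S => ( )
for step in derive(["S"], [0, 1, 1]):
    print(" ".join(step) or "epsilon")
```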
Proof techniques are a core component of computer science (CS) education, yet many CS students struggle to engage with and understand proof-based material. Interactive Theorem Provers (ITPs), originally developed for formal verification, have emerged as promising educational tools that offer structured feedback and align well with CS students’ existing technical skills. While recent work has begun to explore the use of ITPs in educational settings, a notable gap remains: little is known about how well these tools support the range of proof techniques taught in CS curricula or how they might be improved to do so. We specifically seek to close this gap and contribute to the growing literature on ITPs in education through three concrete efforts: a user study of student experiences with the Coq proof assistant, a case study comparing proof development across Coq, Lean, and traditional methods, and a heuristic evaluation of each ITP using Nielsen’s usability heuristics.
Our results show that while the two ITPs can support the development of proof skills, they also present usability challenges --- such as complex syntax and unclear error messaging --- that can hinder learning. We also find that formalized ITP proofs tend to be more explicit and verbose than pen-and-paper proofs, which can affect students’ perception of proof difficulty. Our heuristic evaluation highlights specific areas for improvement in Coq and Lean, including clearer error messages, support for example-driven learning, and expanding proof libraries to better align with educational needs. Together, these findings provide actionable insights for instructors considering ITPs for education and highlight both the pedagogical benefits and current limitations of these systems, as well as opportunities for researchers to improve ITPs as educational tools.
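As an illustration of the verbosity gap noted above, the following Lean 4 snippet (a generic example, not drawn from the studies) spells out, case by case, a fact that a pen-and-paper proof would typically dismiss with "immediate by induction on n".

```lean
-- Illustrative only (not taken from the studies): a fact usually dismissed in
-- one line on paper must be spelled out case by case in an ITP such as Lean 4.
theorem zero_add' (n : Nat) : 0 + n = n := by
  induction n with
  | zero => rfl                         -- base case: 0 + 0 = 0
  | succ n ih => rw [Nat.add_succ, ih]  -- inductive step uses the hypothesis ih
```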
How does Generative AI (GenAI) impact how students work and collaborate in a software engineering course? To explore this question, we conducted an exploratory study in a project-based course where students developed three versions of a system across agile sprints, with unrestricted access to GenAI tools. From the survey responses of 349 students, we found that the technology was used extensively: 84% of students reported use, and 90% of those found it useful. Through semi-structured interviews with 24 of the students, we delved deeper, learning that students used GenAI pervasively, not only to generate code but also to validate work retrospectively, such as checking alignment with requirements and design after implementation had begun. Students often turned to GenAI as their first point of contact, even before consulting teammates, which reduced direct interpersonal collaboration. These results suggest the need for new pedagogical strategies that address not just individual tool use, but also design reasoning and collaborative practices in GenAI-augmented teams.
When solving a problem through programming, we start with information modeling, i.e., representing the information in the problem description as data in the programming language. Information modeling plays an essential role in program development, but it can be challenging for novice programmers. The main obstacles include the lack of clear instructions on the process and the need for knowledge of programming language syntax.
We aim to support novices in acquiring information modeling skills. To achieve this goal, we define information modeling as a three-step process and develop Daisy, a block-based exercise environment for practicing information modeling. We also conducted a preliminary user study to investigate students' behavior and performance in information modeling, as well as to collect their opinions on Daisy's design. Overall, we received positive reactions, along with insights and suggestions for future improvements.
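For readers unfamiliar with the term, a minimal sketch of information modeling follows: a short problem statement is turned into a data definition. The Python rendering and the generic step comments are illustrative assumptions, not Daisy's block notation or its specific three-step process.

```python
# Illustrative only: representing the information in a problem statement as data.
# This is a textual sketch, not Daisy's block notation, and the step comments
# describe a generic progression rather than Daisy's own.

from dataclasses import dataclass

# Problem statement: "A library lends books; each book has a title, an author,
# and a flag saying whether it is currently on loan."

# Generic progression: identify the entities, identify each entity's
# attributes, then encode them as a data definition.

@dataclass
class Book:
    title: str
    author: str
    on_loan: bool

shelf = [
    Book("The Little Schemer", "Friedman and Felleisen", False),
    Book("How to Design Programs", "Felleisen et al.", True),
]
```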
Program design skills can be effectively acquired by following an explicit guideline that systematizes the process of designing programs. While such guidelines exist, novice students often struggle to follow them without instructor support.
We propose an interactive learning environment for practicing program design based on the design recipe. Backed by an LLM, the environment guides the student through the design process via conversations in natural language, while automatically generating code and feedback based on the student’s responses. In this paper, we describe the design of the environment and report the results of a user study, through which we observed both the flexibility and challenges of our LLM-based approach.
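For context, the sketch below shows what a design-recipe-style solution looks like when written out by hand in Python, with signature, purpose statement, examples, body, and tests; it is a generic illustration of the recipe, not output produced by the proposed environment.

```python
# A plain-Python rendering of a design-recipe-style solution (signature,
# purpose, examples, body, tests). A generic illustration of the recipe the
# environment walks students through, not output generated by it.

def fahrenheit_to_celsius(f: float) -> float:
    """Convert a temperature from Fahrenheit to Celsius.

    Examples:
        fahrenheit_to_celsius(32.0)  -> 0.0
        fahrenheit_to_celsius(212.0) -> 100.0
    """
    return (f - 32.0) * 5.0 / 9.0

# Tests written from the examples above.
assert fahrenheit_to_celsius(32.0) == 0.0
assert fahrenheit_to_celsius(212.0) == 100.0
```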
Software developers have long emphasized the need for clear textual descriptions of programs, through documentation and comments. Similarly, curricula often expect students to write purpose statements that articulate in prose what program components are expected to do. Unfortunately, it is difficult to motivate students to do this and to evaluate student work at scale.
We leverage a large language model for this purpose. Specifically, we describe a tool, Porpoise, that presents students with problem descriptions, passes their textual descriptions to a large language model to generate code, evaluates the result against tests, and gives students feedback. In essence, it gives students practice writing quality purpose statements while also familiarizing them with zero-shot prompting in a controlled manner.
We present the tool’s design as well as the experience of deploying it at two universities. This includes asking students to reflect on trade-offs between programming and zero-shot prompting, and seeing what difference it makes to give students different formats of problem descriptions. We also examine affective and load aspects of using the tool. Our findings are somewhat positive but mixed.
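The sketch below is a rough rendering of the loop the abstract describes: a purpose statement goes to an LLM, the generated code is executed, and the result is checked against tests. It is not Porpoise's actual code; the function names, the assumed entry point, and the generate_code stub standing in for the LLM call are all assumptions made for illustration.

```python
# Rough sketch of the described loop: purpose statement -> LLM-generated code
# -> run against instructor tests. None of this is Porpoise's actual code;
# generate_code is a stand-in for whatever LLM call the tool makes.

def generate_code(purpose_statement: str) -> str:
    raise NotImplementedError("stand-in for the tool's LLM call")

def check_purpose_statement(purpose_statement: str, tests: list[tuple]) -> str:
    source = generate_code(purpose_statement)
    namespace: dict = {}
    exec(source, namespace)                 # run the generated definition
    func = namespace["solution"]            # assumed entry-point name
    failures = [(args, expected) for args, expected in tests
                if func(*args) != expected]
    if not failures:
        return "All tests passed: the purpose statement was precise enough."
    return f"{len(failures)} test(s) failed; consider refining the statement."
```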
The need for scalable and personalized content in programming education is driving interest in automating exercise generation. This requires a clear understanding of existing exercises. Our research addresses this by classifying existing exercises by topic and difficulty level. We combine a lexicon-based analysis with machine learning and advanced natural language processing techniques, providing a foundation for AI-assisted content generation. Specifically, we utilize BERTopic for topic modeling and five machine learning models to predict difficulty levels in programming exercises. Our dataset includes 106 programming exercise descriptions from three introductory courses, plus performance data from up to 189 learners. The results demonstrate that lexicon-based approaches significantly improve topic modeling accuracy and coherence compared to the baseline, with reduced variance and more consistent cluster stability. Although difficulty prediction remains challenging due to the complexity of defining ground truth, lexicon integration leads to modest yet consistent performance gains. This work lays essential groundwork for scalable and resource-efficient solutions for the classification and generation of personalized programming exercises.
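As a baseline sketch of the topic-modeling step, the following Python snippet fits BERTopic to a corpus of exercise descriptions; it assumes a realistically sized corpus and is not the authors' pipeline, whose lexicon-based preprocessing is only hinted at here by the placeholder apply_lexicon function.

```python
# Baseline sketch of topic modeling over exercise descriptions with BERTopic.
# Not the authors' pipeline: their lexicon-based preprocessing is only hinted
# at by the placeholder apply_lexicon function below.

from bertopic import BERTopic

def apply_lexicon(text: str) -> str:
    # Placeholder: map course-specific wording onto a shared vocabulary
    # (e.g., "for-loop", "while loop" -> "loop") before modeling.
    return text.lower()

exercise_descriptions = [
    "Write a loop that sums the numbers from 1 to n.",
    "Implement a function that reverses a string.",
    # ... one entry per exercise description; a realistic corpus is assumed
]

docs = [apply_lexicon(d) for d in exercise_descriptions]
topic_model = BERTopic(min_topic_size=2)      # small corpora need small clusters
topics, probs = topic_model.fit_transform(docs)
print(topic_model.get_topic_info())
```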