Anton Golov

A1: (not assigned)

CQS: A Formally-Verified Framework for Fair and Abortable Synchronization Nikita Koval, Dmitry Khalanskiy, Dan Alistarh PLDI 2023 Writing concurrent code that is both correct and efficient is notoriously difficult. Thus, programmers often prefer to use synchronization abstractions, which render code simpler and easier to reason about. Despite a wealth of work on this topic, there is still a gap between the rich semantics provided by synchronization abstractions in modern programming languages—specifically, fair FIFO ordering of synchronization requests and support for abortable operations—and frameworks for implementing these semantics correctly and efficiently. Supporting such semantics is critical given the rising popularity of constructs for asynchronous programming, such as coroutines, which abort frequently and are cheaper to suspend and resume compared to native threads. This paper introduces a new framework called CancellableQueueSynchronizer (CQS), which enables simple yet efficient implementations of a wide range of fair and abortable synchronization primitives: mutexes, semaphores, barriers, count-down latches, and blocking pools. Our main contribution is algorithmic, as implementing both fairness and abortability efficiently at this level of generality is non-trivial. Importantly, all our algorithms, including the CQS framework and the primitives built on top of it, come with formal proofs in the Iris framework for Coq for many of their properties. These proofs are modular, so it is easy to show correctness for new primitives implemented on top of CQS. From a practical perspective, our implementation of CQS for native threads on the JVM improves throughput by up to two orders of magnitude over Java's AbstractQueuedSynchronizer, the only practical abstraction offering similar semantics. Further, we successfully integrated CQS as a core component of the popular Kotlin Coroutines library, validating the framework's practical impact and expressiveness in a real-world environment. In sum, CancellableQueueSynchronizer is the first framework to combine expressiveness with formal guarantees and solid practical performance. Our approach should be extensible to other languages and families of synchronization primitives.
Bas Terwijn B1: Jorrit de Boer
P = BPP if E requires exponential circuits: derandomizing the XOR lemma Russell Impagliazzo, Avi Wigderson STOC 1997 Hardness versus randomness By a result of Impagliazzo and Wigderson (1997), at least one of the following two options is true: (i) EXP ⊆ P/poly, or (ii) P = BPP. The first statement says that every exponential time computable function can be computed using polynomial size circuits (for every separate input length). This is a very unlikely statement that is strongly believed to be false, but alas, this is as yet unproven. For example, it is believed that the famous NP-problem SAT requires superpolynomial circuits. The second statement says that every efficient randomized algorithm can be derandomized into an efficient deterministic algorithm. This is a very strong statement that is partly backed by experience. Perhaps the most famous example is the problem of primality testing PRIMES, for which efficient randomized algorithms have existed since the 1970s, and which was eventually shown to be in P. However, there are still plenty of examples of randomized algorithms for which no good deterministic alternative exists. Still, given how unlikely (i) is, it is now believed that (ii) must be true. For this MFoCS project you can do the following: * Collect interesting examples of probabilistic algorithms for which no deterministic alternative is known. (E.g. polynomial identity testing (Schwartz-Zippel; see the sketch below), estimating volumes using Monte Carlo methods.) * Understand part of the proof of the Impagliazzo-Wigderson result. There are several steps in the proof that may be considered separately. The hardest part is the construction of a so-called pseudorandom generator (PRG). To derandomize a randomized algorithm, we need to deterministically generate random bits, which brings to mind the famous quote by John von Neumann: Anyone who attempts to generate random numbers by deterministic means is, of course, living in a state of sin. The PRG is a means to deterministically generate bits that look "random enough" to pull this off. Reference: L. Trevisan, Derandomization from hardness, 2019
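As a concrete illustration of the first bullet (my own sketch, not from the reference): a minimal Haskell version of Schwartz-Zippel-style polynomial identity testing, where two syntactically different expressions are compared by evaluating them at random points; no comparably efficient deterministic algorithm is known.
```haskell
import Control.Monad (replicateM)
import System.Random (randomRIO)

-- Two syntactically different expressions; the question is whether they
-- denote the same polynomial.
p, q :: Integer -> Integer -> Integer
p x y = (x + y) * (x - y)
q x y = x * x - y * y

-- Schwartz-Zippel style test: evaluate both polynomials at random points
-- drawn from a set much larger than the total degree.  A single
-- disagreement proves the polynomials differ; agreement on all trials
-- means they are equal with high probability.
probablyEqual :: Int -> IO Bool
probablyEqual trials = fmap and . replicateM trials $ do
  x <- randomRIO (-1000000000, 1000000000)
  y <- randomRIO (-1000000000, 1000000000)
  pure (p x y == q x y)

main :: IO ()
main = probablyEqual 20 >>= print   -- prints True here, since p and q coincide
```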
Cynthia Kop C1: (not assigned)
Rewriting Induction + Linear Arithmetic = Decision Procedure Stephan Falke, Deepak Kapur IJCAR 2012 [slides] Rewriting induction is a proof method for proving equations valid by induction over a terminating term rewriting system. This paper describes the method of rewriting induction for term rewriting systems with integers, and gives a set of restrictions that guarantees that the proof method will always succeed in either proving equivalence or non-equivalence -- thus turning the method into a /decision procedure/. For the MFoCS project, there are several other papers on rewriting induction (with or without integers) that you could study as a second paper.
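A hypothetical example of the setting (my own, not taken from the paper): a terminating rewrite system with integer constraints, together with an equation that rewriting induction would try to prove or refute.
```latex
\begin{align*}
  \mathsf{sum}(x) &\to 0                        && [x \le 0]\\
  \mathsf{sum}(x) &\to x + \mathsf{sum}(x - 1)  && [x > 0]
\end{align*}
\[
  \text{candidate inductive theorem:}\qquad
  2 \cdot \mathsf{sum}(x) \approx x \cdot (x + 1) \quad [x \ge 0]
\]
```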
Kasper Hagens + Cynthia Kop C2: Adam Kjelstrøm
Polynomial Invariants by Linear Algebra Steven de Oliveira, Saddek Bensalem, Virgile Prevosto ATVA 2016 [slides] When dealing with computer programming, one often wants to guarantee that the behavior of some piece of code coincides with its intended behavior. Methods from formal verification allow us to prove this. However, such proofs often depend on the availability of invariants. The paper presents a technique for generating polynomial invariants for while loops. This is done in two steps: - Linearization: represent a while loop as a linear transformation. - Invariant generation: compute the polynomial invariants from this linear representation. (A small example of a loop with a polynomial invariant is sketched below.)
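A small hypothetical example (not from the paper) of the kind of loop and invariant involved, written as a Haskell state transformer: the loop body is an affine update of the program variables, and the polynomial relation y = x² is preserved by every iteration.
```haskell
-- Hypothetical while loop:  while (...) { y := y + 2*x + 1; x := x + 1 }
-- modelled as one step of a state transformer on (x, y), starting from (0, 0).
step :: (Integer, Integer) -> (Integer, Integer)
step (x, y) = (x + 1, y + 2 * x + 1)

-- The polynomial invariant  y == x^2  holds at every loop head: it holds
-- initially, and the body maps (x, x^2) to (x + 1, (x + 1)^2).
invariantHolds :: Int -> Bool
invariantHolds n = all (\(x, y) -> y == x * x) (take n (iterate step (0, 0)))
```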
Dario Stein D1: Dick Blankvoort
Generalising monads to arrows John Hughes Science of Computer Programming 37(1-3), 2000 John Hughes' *arrows* are an alternative formalism to monads for organizing effectful computation in functional programming. For every monad, functions `a -> m b` form an arrow, but not all arrows come from monads, so the arrow typeclass gives some extra generality at the expense of expressive power. This project is fairly open-ended -- you'll begin with an introduction to arrows and why they can be useful. You can then dive deeper into implementations, or discuss the more mathematical aspects of arrows (premonoidal categories). It is interesting to compare the arrow interface with Haskell's `proc` notation and a graphical syntax such as string diagrams. Reading: You can learn about arrows in various places. The original arrows paper is "Generalising monads to arrows" (Hughes 2000); for more mathematical aspects, see "Arrows, like monads, are monoids" (Heunen & Jacobs 2006). For graphical notation, have a look at "String Diagrams for Premonoidal Categories" by Mario Roman. Topics: functional programming, category theory, graphical reasoning
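A minimal Haskell sketch of the non-monadic side of the story (standard material in the spirit of Hughes' stream functions; the names below are mine): Kleisli arrows `a -> m b` are already provided by `Control.Arrow` as `Kleisli m`, while the stream transformer below is an arrow that does not arise from a monad.
```haskell
import Prelude hiding (id, (.))
import Control.Category (Category (..))
import Control.Arrow (Arrow (..))

-- A synchronous stream transformer: an arrow that is not the Kleisli
-- arrow of any monad.
newtype SF a b = SF { runSF :: [a] -> [b] }

instance Category SF where
  id          = SF id
  SF g . SF f = SF (g . f)

instance Arrow SF where
  arr f        = SF (map f)
  -- Only lawful for length-preserving transformers, a caveat discussed
  -- in the original paper.
  first (SF f) = SF (\xys -> let (xs, ys) = unzip xys in zip (f xs) ys)
```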
D2: Sergio Domínguez Cabrera
Kan Extensions for Program Optimisation Or: Art and Dan Explain an Old Trick Ralf Hinze MPC 2012 In functional programming, `[x] ++ xs` has low complexity while `xs ++ [x]` is expensive. John Hughes' "difference lists" represent the list `xs` as the prepending function `prep xs = (xs ++)`. Concatenation of difference lists then becomes function composition, which takes care of calling `++` in a more efficient order. This is an instance of a continuation-passing-style (CPS) translation. A generalized version of this trick starts with any monad M, and computes a more efficient continuation representation. Your project is to give an exposition of the continuation trick and an elementary tour of the relevant categorical concepts (monads, adjunctions, Kan extensions). The original paper uses an elegant graphical syntax to reason about these things, which you can present alongside functional programs, or even explore further with the graphical proof assistant homotopy.io. Topics: category theory, graphical reasoning, functional programming
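For concreteness, a minimal Haskell rendering of Hughes' difference lists (standard material; the names `rep`, `toList`, and `cat` are mine):
```haskell
-- A list xs is represented by the function (xs ++); converting back
-- applies the representation to the empty list.
newtype DList a = DList ([a] -> [a])

rep :: [a] -> DList a
rep xs = DList (xs ++)

toList :: DList a -> [a]
toList (DList f) = f []

-- Concatenation is function composition, so a long chain of appends is
-- only realised, right-associated, when toList is finally called.
cat :: DList a -> DList a -> DList a
cat (DList f) (DList g) = DList (f . g)
```
However the `cat`s are associated, `toList` ends up building the result with right-nested calls to `++`, which is the efficiency gain described above.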
Niels van der Weide E1: Lean Ermantraut
Apartness, sharp elements, and the Scott topology of domains Tom de Jong MSCS 2023 Apartness is a fundamental notion in constructive mathematics. When we are working with objects that are inherently infinite, such as real numbers or streams, then proving their equality requires infinite evidence. Apartness formulates inequality in a positive way, and proving apartness of a stream or a real number only requires finite evidence. Constructively, this notion is better behaved than equality. This paper studies apartness from a constructive perspective using domain theory. Domain theory is a field of mathematics that has applications to theoretical computer science, programming languages, and topology, and the fundamental notion in domain theory is that of a dcpo. The paper defines an intrinsic notion of apartness for dcpos, and it shows that this apartness relation is well-behaved if one restricts attention to so-called sharp elements. In addition, the notion of strong maximality is defined, and it is shown that every strongly maximal element is sharp. The paper finishes with a number of examples, among which are the Baire space (i.e., streams of natural numbers) and the Dedekind real numbers, and each of the aforementioned notions is illustrated by these examples.
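A standard illustration (notation mine) of why apartness needs only finite evidence: for streams of natural numbers, two streams are apart as soon as they differ at some position, whereas equality quantifies over all positions.
```latex
\[
  s \mathrel{\#} t \;:\equiv\; \exists n \in \mathbb{N}.\; s_n \neq t_n
  \qquad\text{versus}\qquad
  s = t \;:\equiv\; \forall n \in \mathbb{N}.\; s_n = t_n
\]
```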
Freek Wiedijk F1: (not assigned)
Modular Verification of State-Based CRDTs in Separation Logic Abel Nieto, Arnaud Daby-Seesaram, Léon Gondelman, Amin Timany, Lars Birkedal ECOOP 2023 CRDTs are Conflict-free Replicated Data Types. These are data structures for the situation where people operate on data at different sites without continuous connectivity between those sites. The CAP theorem says that it is not possible to have simultaneously the properties of Consistency, Availability and Partition tolerance. Instead the property of strong eventual consistency is possible. This means that when the sites get connected again, all sites eventually get into the same state without a need for manual conflict resolution, and furthermore ("strong") that the final state does not depend on the order in which the updates are communicated between the sites. Examples of applications of CRDTs are collaborative editors like Google Docs, and the shopping carts of online vendors. The paper (as well as an earlier paper in OOPSLA 2022) describes a formalization of the theory of CRDTs in Coq using the Iris framework. A presentation that just presents this theory clearly and mathematically would already be very nice. But looking at CRDTs for text editing (which can be quite complex) is also interesting.
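As a concrete, textbook example of a state-based CRDT (not taken from the paper): a grow-only counter in Haskell, whose merge is commutative, associative, and idempotent, which is exactly what makes strong eventual consistency possible.
```haskell
import qualified Data.Map.Strict as M

-- Grow-only counter (G-Counter): each replica increments its own entry.
type Replica  = String
type GCounter = M.Map Replica Int

increment :: Replica -> GCounter -> GCounter
increment r = M.insertWith (+) r 1

-- Merging replica states takes the pointwise maximum; the result is the
-- same whatever order (or how often) states are exchanged.
merge :: GCounter -> GCounter -> GCounter
merge = M.unionWith max

value :: GCounter -> Int
value = sum . M.elems
```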
F2: (not assigned)
Proving the Completeness Theorem within Isabelle/HOL James Margetson, Tom Ridge AFP 2004 First order predicate logic has two properties: it is sound and complete. Soundness means that provable statements hold in all models, and completeness states that statements that hold in all models are provable. This was the subject of Gödel's 1929 PhD thesis. This publication is not a refereed paper, but instead is a companion document for an Isabelle/HOL formalization of a proof of the completeness theorem. It might be interesting to compare this formalization with other completeness formalizations (possibly in other proof assistants). And it might be interesting to look into constructive versions of the completeness theorem.
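In symbols (standard notation: ⊢ for provability, ⊨ for truth in all models):
```latex
\[
  \text{soundness:}\quad \Gamma \vdash \varphi \;\Longrightarrow\; \Gamma \models \varphi
  \qquad\qquad
  \text{completeness:}\quad \Gamma \models \varphi \;\Longrightarrow\; \Gamma \vdash \varphi
\]
```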
F3: (not assigned)
ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics Zhangir Azerbayev, Bartosz Piotrowski, Hailey Schoelkopf, Edward W. Ayers, Dragomir Radev, Jeremy Avigad arXiv 2023 Autoformalization is the use of machine learning and theorem proving to automatically turn existing mathematical texts into formalizations. The source for this might be LaTeX (like for instance from arXiv), or scans of books or handwritten notes. The target then would be formalizations in systems like Coq, Isabelle or Lean. This is a very recent paper ("Under review"), which describes a benchmark for autoformalization in the context of the Lean system. A presentation that surveys the current state of the art of this field, as well as where things might be going, would be very interesting.
Herman Geuvers H1: Ioannis Koutsidis
Complete Bidirectional Typing for the Calculus of Inductive Constructions Meven Lennon-Bertrand ITP 2021 The article presents a so-called "bidirectional type system" for the Calculus of Inductive Constructions (CIC), which is the type theory of the Coq system. Type theory is about the typing judgement Γ ⊢ t : T, stating that term t is of type T in context Γ. The core of the Coq system is a "type checking algorithm" that verifies whether the judgment Γ ⊢ t : T holds. When developing a type checking algorithm, one immediately notices that one needs (simultaneously) a "type inference algorithm", which, given Γ and t, computes either a T such that Γ ⊢ t : T holds, or "fail" to signify that no such T exists. So, a typing algorithm has two "modes", a "type checking" mode and a "type inference" mode. The bidirectional approach makes these two modes explicit in the typing rules, by decomposing the undirected typing Γ ⊢ t : T into two separate judgments Γ ⊢ t ▷ T (inference) and Γ ⊢ t ◁ T (checking). This decomposition allows a direct connection between the typing rules and the actual typing algorithms, but also gives a finer-grained structure to typing derivations, which has independent interest. A key property of the bidirectional system is its completeness with respect to the usual undirected one. The bidirectional approach has become very popular and has been applied to various type theories. The present work applies it to the type theory of Coq and proves its completeness, and all this has been formalised in Coq as a part of the MetaCoq project.
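For intuition, two rules in bidirectional style for a simply typed fragment (my own simplification, not the CIC rules of the paper): application infers its type from the function, and an inferred type can be checked against any convertible expected type.
```latex
\[
  \frac{\Gamma \vdash t \,\triangleright\, A \to B \qquad \Gamma \vdash u \,\triangleleft\, A}
       {\Gamma \vdash t\, u \,\triangleright\, B}
  \qquad\qquad
  \frac{\Gamma \vdash t \,\triangleright\, A \qquad A \equiv B}
       {\Gamma \vdash t \,\triangleleft\, B}
\]
```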
H2: (not assigned)
Normal Derivability in Classical Natural Deduction Jan von Plato, Annika Siders Review of Symbolic Logic 5(2), 2012 Following the Curry-Howard formulas-as-types correspondence, there is a one-one correspondence between natural deduction derivations (in logic) and typed lambda-terms. Moreover, the reduction on these terms (e.g. beta-reduction) corresponds to so-called "detour elimination" on natural deduction derivations. A central theorem states that this reduction terminates, which is an important result, as it implies that a derivable formula can be derived using a "normal derivation", which in turn implies, for propositional logic, decidability. For intuitionistic logic, termination of detour elimination has been studied in detail, from various viewpoints (that is: using various definitions of the logical rules). So have the corresponding term reductions, which give rise to standard data-type operations. For classical logic, the situation has been less canonical for a long time: normalizability was only proven for fragments, or the proofs were really contrived. The paper by von Plato-Siders gives a concrete method to transform a derivation in classical predicate logic into normal form. The 'trick' is that they use so-called "general elimination rules", which are not really described in the paper. (The paper is very concise anyway.) So it might be needed to also read at least sections 1,2 (and maybe also sections 5,6) of this paper: Natural deduction with general elimination rules -- Jan von Plato Archive for Mathematical Logic 40, 541–567 (2001)
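Under Curry-Howard, a detour is an introduction rule immediately followed by the corresponding elimination rule; for implication this is exactly a β-redex, so detour elimination is β-reduction (a standard observation, stated here only for orientation).
```latex
\[
  (\lambda x{:}A.\; t)\, u \;\longrightarrow_{\beta}\; t[u/x]
\]
```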
Jurriaan Rot J1: Rutger Dinnissen
Completeness Theorems for Kleene Algebra with Top Damien Pous, Jana Wagemaker CONCUR 2022 Kleene algebra is a complete axiomatisation of language equivalence of regular expressions. One of its appealing aspects is that it is also complete w.r.t. relational models, giving a fascinating connection between relation algebra and automata theory. In the paper, the authors add a "top" constant T to the expressions. In the model of regular languages, the symbol T has a natural interpretation as the full language, and it's easy to axiomatise, with the single equation e <= T. However, in the relational model, the situation is much more difficult: the axiom e <= T is not enough if we want T to be the total relation. In the paper it is shown that it suffices to add the equation e <= e T e to get completeness for the relational case. The proof goes via graph homomorphisms, and is really elegant. The project consists in understanding this proof and the background on Kleene algebra. Subsequently there are plenty of directions to expand (with a second paper), in relation algebra and/or Kleene algebra.
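Spelled out in the two models (standard notation, not the paper's): in the language model over an alphabet Σ the constant T is the full language, in the relational model over a set X it should be the full relation, and the two axioms mentioned above read as follows.
```latex
\[
  \top_{\mathsf{Lang}} = \Sigma^{*},
  \qquad
  \top_{\mathsf{Rel}} = X \times X,
  \qquad
  e \le \top,
  \qquad
  e \le e\,\top\, e .
\]
```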
Loes Kruger + Jurriaan Rot J2: Cas Haaijman
Compositional Learning for Interleaving Parallel Automata Faezeh Labbaf, Jan Friso Groote, Hossein Hojjat, Mohammad Reza Mousavi FoSSaCS 2023 Automata learning is a family of techniques for automatically constructing automata by running tests, such as checking whether or not a word is in the target language. It has been applied, for instance, to learn models of - and find bugs in - implementations of network protocols. In this paper the authors assume structure on the target automaton, namely that it consists of several concurrent components. The idea is to learn these components, instead of the entire composed system (which is big). For this MFoCS project, you will learn about automata learning and this recent extension.
Marnix Suilen + Nils Jansen N1: (not assigned)
Predictive Representations of State Michael Littman, Richard Sutton, Satinder Singh NIPS 2001 Learning Predictive State Representations Satinder Singh, Michael Littman, Nicholas Jong, David Pardoe, Peter Stone ICML 2003 Partially observable Markov decision processes (POMDPs) are the standard model for decision-making under uncertainty. The decision maker (agent) is faced with uncertainty about the outcomes of their decisions and also only has partial knowledge of the state of the environment, making the problem of finding an optimal decision-making strategy especially hard. Indeed, most problems for POMDPs are undecidable, and the ones that are decidable range from PSPACE-complete to EXPTIME-complete. Standard approaches to finding good decision-making strategies in POMDPs often consider the observation history. That is, a decision-making strategy looks back at what happened before in an attempt to gain knowledge about the current state of the environment. It is known, however, that no finite amount of history can ever be sufficient to fully identify the underlying state information of POMDPs in general. Predictive state representations (PSRs) were introduced for POMDPs as an alternative to histories. Instead of looking back at what happened before, a PSR looks ahead into the future and tries to predict what will happen. Formally, a PSR is a vector of predictions (probabilities) for a number of tests (finite sequences of actions and observations) that forms a sufficient statistic for the environment. In this project, we consider two papers as a starting point. Paper (1) connects POMDPs and PSRs in a fundamental way by proving that every finite POMDP model has a linear PSR of size bounded by the number of POMDP states. When the POMDP model is fully known, a PSR can be constructed. However, in many realistic applications, POMDP models are unknown and need to be inferred from data. Paper (2) does precisely that by introducing an algorithm for learning a PSR from observed data from an unknown POMDP.
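Concretely (standard PSR notation, not specific to either paper): a test is a finite sequence of actions and observations, its prediction given a history h is the probability of observing exactly those observations if those actions are taken, and the PSR state is the vector of predictions for a fixed core set of tests.
```latex
\[
  t = a_1 o_1 \cdots a_k o_k,
  \qquad
  p(t \mid h) \;=\; \Pr\bigl(o_1, \ldots, o_k \bigm| h,\; a_1, \ldots, a_k\bigr),
  \qquad
  \text{state} = \bigl(p(t_1 \mid h), \ldots, p(t_n \mid h)\bigr).
\]
```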
Eline Bovy + Nils Jansen N2: Willem Lambooij
Solving zero-sum one-sided partially observable stochastic games Karel Horák, Branislav Bošanský, Vojtěch Kovařik, Christopher Kiekintveld Artificial Intelligence 316, 2023 Partially Observable Stochastic Games (POSGs) are models for sequential decision-making problems with one or more players where the players have partial information about the state of the problem. POSGs extend the Partially Observable Markov Decision Processes (POMDPs) to the multi-agent setting. The goal in a POSG is to find a Nash equilibrium, i.e., decision-making strategies for each player such that no player can benefit by changing their strategy. This paper focuses on a subset of POSGs: two-player zero-sum one-sided POSGs. Zero-sum means that the players have directly opposing goals. A reward for one player leads to an equally large negative reward for the other. One-sided means that only one of the players has partial observability while the other has full observability. The paper adapts the Heuristic Search Value Iteration algorithm (HSVI) used for POMDPs to work for their POSG subset. This algorithm approximates the expected discounted reward when both players play according to the Nash equilibrium. Furthermore, the paper provides algorithms to derive the approximate Nash equilibrium, i.e., the decision-making strategies that lead to this expected discounted reward. Finally, the paper evaluates the new algorithms on various multi-agent sequential decision-making problems.
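In symbols (my notation, not the paper's): zero-sum means the two players' rewards cancel, and a Nash equilibrium in this setting is a pair of strategies attaining the max-min value of the expected discounted reward.
```latex
\[
  r_2(s, a_1, a_2) = -\, r_1(s, a_1, a_2),
  \qquad
  V^{*} \;=\; \max_{\sigma_1} \min_{\sigma_2}\;
  \mathbb{E}_{\sigma_1, \sigma_2}\!\Bigl[\; \sum_{t \ge 0} \gamma^{t}\, r_1(s_t, a_{1,t}, a_{2,t}) \Bigr].
\]
```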
Robbert Krebbers R1: (not assigned)
Deriving Proved Equality Tests in Coq-Elpi: Stronger Induction Principles for Containers in Coq Enrico Tassi ITP 2019 When defining algebraic data types in programming languages such as Rust and Haskell one can let the language automatically generate boilerplate definitions. An example is the equality test `T_eq : T → T → Bool`. Generating these definitions in proof assistants (such as Coq) instead of ordinary programming languages (such as Rust or Haskell) is complicated. One should also generate correctness proofs, e.g., `T_eq x y = true <-> x = y`. This paper addresses the challenge of dealing with containers (e.g., using `list` in the definition of n-ary trees) and providing modular proofs in the presence of Coq's syntactic termination checker. The result involves a technique called "unary parametricity" and has been implemented using the meta programming language "Coq-Elpi". Coq-Elpi is a new language based on "λProlog" that can be seen as a successor of Ltac. This paper will excite you if: - You want to understand how inductive types and termination checking works in type theory / Coq. - You want to know what "parametricity" is about. - You want to learn about meta programming in type theory / Coq. By reading a second paper you can, for example, study other uses of meta programming in Coq, study the theory of λProlog, investigate other challenges (types with many constructors, dependent types).
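For contrast, in Haskell the boilerplate really is a one-liner (standard `deriving`, shown only to fix intuitions); the interesting case in the paper is precisely a datatype whose recursion goes through a container such as `list`, and in Coq one additionally wants the proof `T_eq x y = true <-> x = y`.
```haskell
-- n-ary (rose) trees: the recursive occurrence sits inside the list
-- container, which is what complicates both the generated equality test
-- and the syntactic termination check for it in Coq.
data Rose a = Node a [Rose a]
  deriving (Eq, Show)

-- Haskell's deriving only produces the function; in a proof assistant one
-- also wants a proof that it decides propositional equality.
```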
Ike Mulder + Robbert Krebbers R2: Menzo van Kessel
Subformula Linking for Intuitionistic Logic with Application to Type Theory Kaustuv Chaudhuri CADE 2021 Traditional proof systems for (intuitionistic) first-order logic have rules to manipulate a single connective in a single formula. This means that sometimes, it takes quite a bit of effort to match hypotheses with your proof obligation. For example, suppose we are proving ((A → A) → B ∧ C) ⊢ C ∧ B. When proving this, you want to match up the atoms B and C from your hypotheses with these same atoms in the goal. But to do so, you must first eliminate the implication (and prove (A → A)), then eliminate the conjunction, before you can continue. In a sense, these steps are quite 'mechanical'. This paper introduces 'subformula linking' for intuitionistic logic. Rather than doing all these steps yourself, you simply 'link' the occurrences of B in the above example. The paper introduces rules that determine an appropriate remaining goal. There is a prototype tool, called ProfInt, with which you can actually do this in the browser. If you want to play around with it, see here: https://chaudhuri.info/research/profint/ This paper should be interesting to you if you are interested in formal logic and automated or interactive theorem proving. For the second paper, you could look for example into earlier work on subformula linking in linear logic, or look into other work on 'gestural' theorem proving.
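To make the 'mechanical' steps explicit (my own rendering of the example above): without linking, one first discharges the implication, which requires proving A → A, and then splits the conjunctions before the atoms can finally be matched.
```latex
\[
  (A \to A) \to B \wedge C \;\vdash\; C \wedge B
  \;\leadsto\;
  B \wedge C \;\vdash\; C \wedge B
  \;\leadsto\;
  B \wedge C \vdash C
  \;\text{ and }\;
  B \wedge C \vdash B .
\]
```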
Marc Hermes + Robbert Krebbers R3: Jakob Wuhrer
Certifying Graph-Manipulating C Programs via Localizations within Data Structures Shengyi Wang, Qinxiang Cao, Anshuman Mohan, Aquinas Hobor OOPSLA 2019 [slides] Data structures frequently involve underlying graph elements, particularly when implemented using pointers in the heap. Such implementations present unique hurdles for formal verification: they may permit cycles and shared nodes, complicating the establishment of invariants, and their operations often require complex pointer manipulations. The featured paper proposes a method grounded in a localization rule, enabling modular reasoning by focusing on the specific heap segments relevant to the program's execution. The papers listed in the accompanying slides present diverse approaches for verifying graph-based data structures, making them suitable for comparative analysis. "Proof Pearl: Magic Wand as Frame" stands out as a great primer on the subject.
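For orientation, the standard frame rule of separation logic (not the paper's exact localization rule, which refines this style of reasoning): a proof may focus on the footprint P of a command c and carry the untouched remainder R of the heap along unchanged.
```latex
\[
  \frac{\{P\}\; c\; \{Q\}}{\{P * R\}\; c\; \{Q * R\}}
\]
```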
Wieb Bosma W1: Sipho Kemkes
On the computational complexity of algebraic numbers: the Hartmanis-Stearns problem revisited Boris Adamczewski, Julien Cassaigne, Marion le Gonidec Transactions of the AMS 373(5), 2020 https://adamczewski.perso.math.cnrs.fr/HS.pdf The paper is concerned with the question from computational complexity theory asking how difficult it is to compute the base b digits of an algebraic real number. In this nicely written, recent paper three negative results are proven, two of which are novel. They state that it is impossible to generate the base b expansion * by a finite automaton (or uniform morphism), or * by a morphism with exponential growth, or * by a deterministic pushdown automaton. As a consequence, real numbers generated by any of these devices are necessarily transcendental (if not rational). It may not be necessary to master all the details of this long paper, but you should certainly acquire a taste of the results and proofs. For a second paper it may be useful to go back to either the paper of Hartmanis and Stearns, or probably better, to the paper by Cobham that appeared a few years later and in which he proved the first result (apparently) and also stated the second result.
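A classical illustration (not from the paper itself): the Thue-Morse sequence is generated by a uniform 2-morphism (equivalently, by a finite automaton), so results of this kind imply that the corresponding real number, which is easily seen to be irrational, is in fact transcendental.
```latex
\[
  0 \mapsto 01, \quad 1 \mapsto 10
  \qquad\Longrightarrow\qquad
  t = 0110\,1001\,1001\,0110\ldots,
  \qquad
  \sum_{n \ge 1} t_n\, 2^{-n} \ \text{is transcendental.}
\]
```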