Anton Golov
A1: (not assigned)
CQS: A Formally-Verified Framework for Fair and Abortable Synchronization
Nikita Koval, Dmitry Khalanskiy, Dan Alistarh
PLDI 2023
Writing concurrent code that is both correct and efficient is notoriously
difficult. Thus, programmers often prefer to use synchronization abstractions,
which render code simpler and easier to reason about. Despite a wealth of work
on this topic, there is still a gap between the rich semantics provided by
synchronization abstractions in modern programming languages—specifically,
fair FIFO ordering of synchronization requests and support for
abortable operations—and frameworks for implementing these semantics correctly
and efficiently.
Supporting such semantics is critical given the rising popularity of
constructs for asynchronous programming, such as coroutines, which abort
frequently and are cheaper to suspend and resume compared to native threads.
This paper introduces a new framework called
CancellableQueueSynchronizer
(CQS),
which enables simple yet efficient implementations of a wide range of fair and
abortable synchronization primitives: mutexes, semaphores, barriers,
count-down latches, and blocking pools.
Our main contribution is algorithmic, as implementing both fairness and
abortability efficiently at this level of generality is non-trivial.
Importantly, all our algorithms, including the CQS framework and the
primitives built on top of it, come with formal proofs in the Iris
framework for Coq for many of their properties. These proofs are modular, so
it is easy to show correctness for new primitives implemented on top of CQS.
From a practical perspective, our implementation of CQS for native threads on the
JVM improves throughput by up to two orders of magnitude over Java's
AbstractQueuedSynchronizer
, the only practical abstraction
offering similar semantics.
Further, we successfully integrated CQS as a core component of the popular
Kotlin Coroutines library, validating the framework's practical impact and
expressiveness in a real-world environment.
In sum, CancellableQueueSynchronizer
is the first framework to
combine expressiveness with formal guarantees and solid practical performance.
Our approach should be extensible to other languages and families of
synchronization primitives.
Bas Terwijn
B1: Jorrit de Boer
P = BPP if E requires exponential circuits: derandomizing the XOR lemma
Russell Impagliazzo, Avi Wigderson
STOC 1997
Hardness versus randomness
By a result of Impagliazzo and Wigderson (1997), at least one of the
following two options is true:
(i) EXP ⊆ P/poly
(ii) P = BPP
The first statement says that every exponential time computable function
can be computed using polynomial size circuits (for every separate input
length). This statement is considered very unlikely and is strongly believed
to be false, but alas, its falsity remains unproven. For example, it is believed that
the famous NP-problem SAT requires superpolynomial circuits.
The second statement says that every efficient randomized algorithm can
be derandomized into an efficient deterministic algorithm. This is a very
strong statement that is partly backed by experience. Perhaps the most
famous example is the problem of primality testing, PRIMES, for which
efficient randomized algorithms have existed since the 1970s, and which was
eventually shown to be in P. However, there are still plenty of examples
of randomized algorithms for which no good deterministic alternative
exists. Still, given how unlikely (i) is, it is now believed that (ii)
must be true.
For this MFoCS project you can do the following:
* Collect interesting examples of probabilistic algorithms for which no
deterministic alternative is known. (E.g. polynomial identity testing
(Schwartz-Zippel), estimating volumes using Monte Carlo methods.) A small
sketch of the former is given after this list.
* Understand part of the proof of the Impagliazzo-Wigderson result.
There are several steps in the proof, which may be considered separately.
The hardest part is the construction of a so-called pseudorandom
generator (PRG). To derandomize a randomized algorithm, we need to
deterministically generate random bits, which brings to mind the famous
quote by John von Neumann:
Anyone who attempts to generate random numbers by deterministic means
is, of course, living in a state of sin.
The PRG is a means to deterministically generate bits that look "random
enough" to pull this off.
Reference:
L. Trevisan, Derandomization from hardness, 2019
Cynthia Kop
C1: (not assigned)
Rewriting Induction + Linear Arithmetic = Decision Procedure
Stephan Falke, Deepak Kapur
IJCAR 2012
[slides]
Rewriting induction is a proof method to show equivalence of equations using
induction over a terminating term rewriting system. This paper describes the
method of rewriting induction for term rewriting systems with integers, and
gives a set of restrictions that guarantees that the proof method will always
succeed in either proving equivalence or non-equivalence -- thus turning the
method into a /decision procedure/.
For the MFoCS project, there are several other papers on rewriting induction
(with or without integers) that you could study as a second paper.
Kasper Hagens + Cynthia Kop
C2: Adam Kjelstrøm
Polynomial Invariants by Linear Algebra
Steven de Oliveira, Saddek Bensalem, Virgile Prevosto
ATVA 2016
[slides]
When dealing with computer programming, one often wants to
guarantee that the behavior of some piece of code coincides
with its intended behavior. Methods from formal verification
allow us to prove this. However, such proofs often depend
on the availability of invariants. The paper presents a
technique for generating polynomial invariants for while
loops. This is done in two steps:
- Linearization: represent a while loop as a linear
transformation.
- Invariant generation: compute the polynomial invariants,
using this linear representation.
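As a toy illustration of both steps (my own example, not taken from the
paper): the loop  x := x + y; y := y + 1  started from x = 0, y = 0 satisfies
the polynomial invariant 2x = y^2 - y. Introducing an auxiliary variable
z = y^2 makes every update linear, so one loop iteration becomes a linear map
on the vector (x, y, z, 1), and the polynomial invariant becomes a linear
constraint on that vector. A small Haskell sketch:

    -- One iteration of  x := x + y; y := y + 1  after linearization: with z
    -- standing for y^2, all updates are linear in the state (x, y, z, 1).
    step :: (Integer, Integer, Integer, Integer)
         -> (Integer, Integer, Integer, Integer)
    step (x, y, z, one) = (x + y, y + one, z + 2 * y + one, one)

    -- The polynomial invariant 2x = y^2 - y, which over the linearized state
    -- is the linear constraint 2x - z + y = 0.
    invariant :: (Integer, Integer, Integer, Integer) -> Bool
    invariant (x, y, z, _) = 2 * x - z + y == 0

    -- Check the invariant on the first iterations starting from x = 0, y = 0.
    main :: IO ()
    main = print (all invariant (take 10 (iterate step (0, 0, 0, 1))))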
Dario Stein
D1: Dick Blankvoort
Generalising monads to arrows
John Hughes
Science of Computer Programming 37(1-3), 2000
John Hughes' *arrows* are an alternative formalism to monads for organizing
effectful computation in functional programming. For every monad, functions `a
-> m b` form an arrow, but not all arrows come from monads, so the arrow
typeclass gives some extra generality at the expense of expressive power.
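To make the monad-to-arrow direction concrete, here is a small Haskell
sketch using the standard Control.Arrow library (the example itself is mine,
not from the paper): for any monad m, the Kleisli construction packages
functions `a -> m b` as an arrow, and the arrow combinators then compose
them like ordinary functions.

    import Control.Arrow

    -- For the Maybe monad, functions String -> Maybe Int form a Kleisli arrow.
    countChars :: Kleisli Maybe String Int
    countChars = Kleisli (\s -> if null s then Nothing else Just (length s))

    -- Pure functions are lifted into any arrow with arr.
    double :: Kleisli Maybe Int Int
    double = arr (* 2)

    -- Arrows compose with (>>>), hiding the monadic plumbing.
    pipeline :: Kleisli Maybe String Int
    pipeline = countChars >>> double

    main :: IO ()
    main = do
      print (runKleisli pipeline "hello")   -- Just 10
      print (runKleisli pipeline "")        -- Nothing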
This project is fairly open-ended -- you'll begin with an introduction to
arrows and why they can be useful. You can then dive deeper into
implementations, or discuss the more mathematical aspects of arrows
(premonoidal categories). It is interesting to compare the arrow interface
with Haskell's `proc` notation and a graphical syntax such as string diagrams.
Reading: You can learn about arrows in various places. The original arrows
paper is "Generalising monads to arrows" (Hughes 2000), for more mathematical
aspects, see "Arrows, like monads, are monoids" (Heunen&Jacobs 2006). For
graphical notation, have a look at "String Diagrams for Premonoidal
Categories" by Mario Roman.
Topics: functional programming, category theory, graphical reasoning
D2: Sergio Domínguez Cabrera
Kan Extensions for Program Optimisation Or: Art and Dan Explain an
Old Trick
Ralf Hinze
MPC 2012
In functional programming, `[x] ++ xs` has low complexity while `xs ++ [x]` is
expensive. John Hughes' "difference lists" represent the list `xs` as the
function `prep xs = (xs ++)`. Concatenation of difference lists then becomes
function composition, which takes care of calling ++ in a more efficient
order. This is an instance of a continuation-passing-style (CPS) translation.
A generalized version of this trick starts with any monad M, and computes a
more efficient continuation representation.
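A minimal Haskell sketch of the difference-list instance of this trick (the
function names are illustrative):

    -- A difference list represents xs by the function (xs ++). Appending an
    -- element becomes function composition, and the expensive right-nested
    -- calls to ++ are re-associated only when we finally apply to [].
    type DList a = [a] -> [a]

    rep :: [a] -> DList a
    rep xs = (xs ++)

    snocD :: DList a -> a -> DList a
    snocD d x = d . (x :)

    toList :: DList a -> [a]
    toList d = d []

    -- Naive snoc-based accumulation is quadratic; the DList version is linear.
    naive :: Int -> [Int]
    naive n = foldl (\xs x -> xs ++ [x]) [] [1 .. n]

    fast :: Int -> [Int]
    fast n = toList (foldl snocD (rep []) [1 .. n])

    main :: IO ()
    main = print (naive 1000 == fast 1000)   -- True, but fast scales much better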
Your project is to give an exposition of the continuation trick and an
elementary tour of the relevant categorical concepts (monads, adjunctions, Kan
extensions). The original paper uses an elegant graphical syntax to reason
about these things, which you can present alongside functional programs, or
even explore further with the graphical proof assistant homotopy.io.
Topics: category theory, graphical reasoning, functional programming
Niels van der Weide
E1: Lean Ermantraut
Apartness, sharp elements, and the Scott topology of domains
Tom de Jong
MSCS 2023
Apartness is a fundamental notion in constructive
mathematics. When we work with objects that are
inherently infinite, such as real numbers or streams,
proving their equality requires infinite evidence.
Apartness formulates inequality in a positive way, and
proving apartness of a stream or a real number only
requires finite evidence. Constructively, this notion is
better behaved than equality.
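As a small illustration of this finite-evidence point (my own sketch, not
from the paper): two streams are apart precisely when they differ at some
finite position, and that position is a finite witness we can compute,
whereas equality of streams can never be confirmed by finitely many
observations.

    -- Infinite streams of values.
    data Stream a = Cons a (Stream a)

    -- Search (up to a bound) for the first index at which two streams differ.
    -- Such an index is finite evidence that the streams are apart.
    apartWithin :: Eq a => Int -> Stream a -> Stream a -> Maybe Int
    apartWithin bound = go 0
      where
        go i (Cons x xs) (Cons y ys)
          | i >= bound = Nothing
          | x /= y     = Just i
          | otherwise  = go (i + 1) xs ys

    constant :: a -> Stream a
    constant x = Cons x (constant x)

    nats :: Int -> Stream Int
    nats n = Cons n (nats (n + 1))

    main :: IO ()
    main = do
      print (apartWithin 100 (constant 0) (nats 0))  -- Just 1: a finite witness
      print (apartWithin 100 (nats 0) (nats 0))      -- Nothing within the bound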
This paper studies apartness from a constructive perspective
using domain theory. Domain theory is a field of mathematics
that has applications to theoretical computer science,
programming languages, and topology, and the fundamental
notion in domain theory is that of a dcpo. The paper defines
an intrinsic notion of apartness for dcpos, and it shows that
this apartness relation is well-behaved if one restricts
attention to so-called sharp elements. In addition, the
notion of strong maximality is defined, and it is shown
that every strongly maximal element is sharp. The paper
finishes with a number of examples, among which are the
Baire space (i.e., streams of natural numbers) and the
Dedekind real numbers, and each of the aforementioned
notions is illustrated by them.
Freek Wiedijk
F1: (not assigned)
Modular Verification of State-Based CRDTs in Separation Logic
Abel Nieto, Arnaud Daby-Seesaram, Léon Gondelman, Amin Timany, Lars Birkedal
ECOOP 2023
CRDTs are Conflict-free Replicated Data Types. These are
data structures for the case where people operate on data
at different sites without continuous connectivity between
those sites. The CAP theorem says that it is not possible
to have simultaneously the properties of Consistency,
Availability and Partition tolerance. Instead, the
property of strong eventual consistency is possible.
This means that when the sites get connected again,
all sites eventually get into the same state without a need
for manual conflict resolution, and furthermore ("strong")
that the final state does not depend on the order in
which the updates are communicated between the sites.
Examples of applications of CRDTs are collaborative editors
like Google Docs, and the shopping carts of online vendors.
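As a minimal illustration of a state-based CRDT (a standard textbook
example, not the development verified in the paper), here is a grow-only
counter in Haskell. Its merge is a pointwise maximum, which is commutative,
associative, and idempotent, and that is exactly what makes all replicas
converge regardless of the order in which states are exchanged.

    import qualified Data.Map.Strict as Map

    -- A grow-only counter (G-Counter): one slot per site.
    type Replica = String
    type GCounter = Map.Map Replica Int

    -- A site increments only its own slot.
    increment :: Replica -> GCounter -> GCounter
    increment r = Map.insertWith (+) r 1

    -- The counter value is the sum of all slots.
    value :: GCounter -> Int
    value = sum . Map.elems

    -- Merging takes the pointwise maximum: commutative, associative, and
    -- idempotent, so replicas converge whatever the delivery order.
    merge :: GCounter -> GCounter -> GCounter
    merge = Map.unionWith max

    main :: IO ()
    main = do
      let a = increment "siteA" (increment "siteA" Map.empty)  -- siteA counted 2
          b = increment "siteB" Map.empty                      -- siteB counted 1
      print (value (merge a b))        -- 3
      print (merge a b == merge b a)   -- True: merge order does not matter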
The paper (as well as an earlier paper in OOPSLA 2022)
describes a formalization of the theory of CRDTs in
Coq using the Iris framework. A presentation that just
presents this theory clearly and mathematically would
already be very nice. But looking at CRDTs for text editing
(which can be quite complex) is also interesting.
F2: (not assigned)
Proving the Completeness Theorem within Isabelle/HOL
James Margetson, Tom Ridge
AFP 2004
First order predicate logic has two properties: it
is sound and complete. Soundness means that provable
statements hold in all models, and completeness states
that statements that hold in all models are provable.
This was the subject of Gödel's 1929 PhD thesis.
This publication is not a refereed paper, but instead is
a companion document for an Isabelle/HOL formalization
of a proof of the completeness theorem.
It might be interesting to compare this formalization
with other completeness formalizations (possibly in other
proof assistants). And it might be interesting to look
into constructive versions of the completeness theorem.
F3: (not assigned)
ProofNet: Autoformalizing and Formally Proving
Undergraduate-Level Mathematics
Zhangir Azerbayev, Bartosz Piotrowski, Hailey Schoelkopf,
Edward W. Ayers, Dragomir Radev, Jeremy Avigad
arXiv 2023
Autoformalization is the use of machine learning and
theorem proving to automatically turn existing mathematical
texts into formalizations. The source for this might
be LaTeX (for instance, from arXiv), or scans of
books or handwritten notes. The target then would be
formalizations in systems like Coq, Isabelle or Lean.
This is a very recent paper ("Under review"), which
describes a benchmark for autoformalization in the context
of the Lean system.
A presentation that surveys the current state of the art
of this field, as well as where things might be going,
would be very interesting.
Herman Geuvers
H1: Ioannis Koutsidis
Complete Bidirectional Typing for the Calculus of Inductive
Constructions
Meven Lennon-Bertrand
ITP 2021
The article presents a so-called "bidirectional type system" for the
Calculus of Inductive Constructions (CIC), which is the type theory of
the Coq system.
Type theory is about the typing judgement Γ ⊢ t : T, stating that term
t is of type T in context Γ. The core of the Coq system is a "type
checking algorithm" that verifies whether the judgment Γ ⊢ t : T
holds. When developing a type checking algorithm, one immediately
notices that one needs (simultaneously) a "type inference algorithm",
which, given Γ and t, computes: either a T such that Γ ⊢ t : T holds,
or `fail' to signify that no such T exists. So, a typing algorithm has
two "modes", a "type checking" mode and a "type inference" mode.
The bidirectional approach makes these two modes explicit in the typing
rules, by decomposing the undirected typing Γ ⊢ t : T into two
separate judgments Γ ⊢ t ▷ T (inference) and Γ ⊢ t ◁ T (checking).
This decomposition allows a direct connection between the typing rules
and the actual typing algorithms, but also gives a finer grained
structure to typing derivations, which has independent interest.
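To illustrate the two modes on a far smaller system than CIC (a standard
simply-typed sketch of my own, not the system of the paper): inference
synthesizes a type from a term, checking compares a term against a given
type, and the two modes call each other.

    -- A tiny bidirectional type checker for simply typed lambda terms.
    data Ty = TBool | TArrow Ty Ty deriving (Eq, Show)

    data Term
      = Var String
      | Lam String Term   -- only checkable, against an arrow type
      | App Term Term
      | BoolLit Bool
      | Ann Term Ty       -- an annotation switches from checking to inference
      deriving Show

    type Ctx = [(String, Ty)]

    -- Inference mode:  Γ ⊢ t ▷ T  (compute T from Γ and t, or fail).
    infer :: Ctx -> Term -> Maybe Ty
    infer ctx (Var x)     = lookup x ctx
    infer _   (BoolLit _) = Just TBool
    infer ctx (Ann t ty)  = if check ctx t ty then Just ty else Nothing
    infer ctx (App f a)   = case infer ctx f of
      Just (TArrow dom cod) | check ctx a dom -> Just cod
      _                                       -> Nothing
    infer _   (Lam _ _)   = Nothing  -- lambdas are checked, not inferred

    -- Checking mode:  Γ ⊢ t ◁ T  (verify that t has the given type T).
    check :: Ctx -> Term -> Ty -> Bool
    check ctx (Lam x body) (TArrow dom cod) = check ((x, dom) : ctx) body cod
    check ctx t ty                          = infer ctx t == Just ty

    main :: IO ()
    main = print (infer [] (App (Ann (Lam "x" (Var "x")) (TArrow TBool TBool))
                                (BoolLit True)))   -- Just TBool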
A key property of the bidirectional system is its completeness with
respect to the usual undirected one.
The bidirectional approach has become very popular and has been
applied to various type theories. The present work applies it to the
type theory of Coq and proves its completeness, and all this has been
formalised in Coq as a part of the MetaCoq project.
H2: (not assigned)
Normal Derivability in Classical Natural Deduction
Jan von Plato, Annika Siders
Review of Symbolic Logic 5(2), 2012
Following the Curry-Howard formulas-as-types correspondence, there is
a one-one correspondence between natural deduction derivations (in
logic) and typed lambda-terms. Moreover, the reduction on these terms
(e.g. beta-reduction) corresponds to so-called "detour elimination"
on natural deduction derivations. A central theorem states that this reduction
terminates, which is an important result, as it implies that a
derivable formula can be derived using a "normal derivation", which
implies decidability of the logic. For intuitionistic logic,
termination of detour elimination has been studied in detail, from
various viewpoints (that is: using various definitions of the logical
rules). So have the corresponding term reductions, which give rise to
standard data-type operations. For classical logic, the situation has
been less canonical for a long time: normalizability was only proven
for fragments, or the proofs were really contrived.
The paper by von Plato-Siders gives a concrete method to transform a
derivation in classical predicate logic into normal form. The 'trick'
is that they use so-called "general elimination rules", which are not
really described in the paper. (The paper is very concise anyway.) So
it might be needed to also read at least sections 1,2 (and maybe also
sections 5,6) of this paper:
Natural deduction with general elimination rules
-- Jan von Plato
Archive for Mathematical Logic 40, 541–567 (2001)
Jurriaan Rot
J1: Rutger Dinnissen
Completeness Theorems for Kleene Algebra with Top
Damien Pous, Jana Wagemaker
CONCUR 2022
Kleene algebra is a complete axiomatisation of language
equivalence of regular expressions. One of its appealing
aspects is that it is also complete w.r.t. relational models,
giving a fascinating connection between relation algebra
and automata theory. In the paper, the authors add a "top"
constant T to the expressions. In the model of regular
languages, the symbol T has a natural interpretation as
the full language, and it's easy to axiomatise, with the
single equation e <= T. However, in the relational model,
the situation is much more difficult: the axiom e <= T is not
enough if we want T to be the total relation. In the paper
it is shown that it suffices to add the equation e <= e T
e to get completeness for the relational case. The proof
goes via graph homomorphisms, and is really elegant.
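To see why the extra axiom is sound in the relational model (a small check
of my own, not from the paper): if (x, y) is in e, then (x, y) is also in
e T e, because we can route through the pair (y, x), which is always in the
total relation T. A quick finite-relation sketch in Haskell:

    import qualified Data.Set as Set

    -- Binary relations over a small finite carrier.
    type Rel = Set.Set (Int, Int)

    carrier :: [Int]
    carrier = [0 .. 3]

    -- Relational composition, and the total relation interpreting T.
    compose :: Rel -> Rel -> Rel
    compose r s = Set.fromList
      [ (x, z) | (x, y) <- Set.toList r, (y', z) <- Set.toList s, y == y' ]

    top :: Rel
    top = Set.fromList [ (x, y) | x <- carrier, y <- carrier ]

    -- Check the axioms e <= T and e <= e T e on a sample relation.
    main :: IO ()
    main = do
      let e = Set.fromList [(0, 1), (1, 2), (2, 2)]
      print (e `Set.isSubsetOf` top)                        -- the easy axiom
      print (e `Set.isSubsetOf` compose e (compose top e))  -- True as well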
The project consists in understanding this proof and the
background on Kleene algebra. Subsequently there are plenty
of directions to expand (with a second paper), in relation
algebra and/or Kleene algebra.
Loes Kruger + Jurriaan Rot
J2: Cas Haaijman
Compositional Learning for Interleaving Parallel Automata
Faezeh Labbaf, Jan Friso Groote, Hossein Hojjat, Mohammad Reza Mousavi
FoSSaCS 2023
Automata learning is a family of techniques for automatically
constructing automata by running tests, such as checking
whether or not a word is in the target language. It has been
applied, for instance, to learn models of - and find bugs in
- implementations of network protocols. In this paper the
authors assume structure on the target automaton, namely
that it consists of several concurrent components. The
idea is to learn these components, instead of the entire
composed system (which is big). For this MFoCS project, you
will learn about automata learning and this recent extension.
Marnix Suilen + Nils Jansen
N1: (not assigned)
Predictive Representations of State
Michael Littman, Richard Sutton, Satinder Singh
NIPS 2001
Learning Predictive State Representations
Satinder Singh, Michael Littman, Nicholas Jong, David Pardoe, Peter Stone
ICML 2003
Partially observable Markov decision processes (POMDPs) are
the standard model for decision-making under uncertainty.
The decision maker (agent) is faced with uncertainty about
the outcomes of their decisions and also only has partial
knowledge of the state of the environment, making the
problem of finding an optimal decision-making strategy
especially hard. Indeed, most problems for POMDPs are
undecidable, and the ones that are decidable range from
PSPACE-complete to EXPTIME-complete.
Standard approaches to finding good decision-making
strategies in POMDPs often consider the observation
history. That is, a decision-making strategy looks
back at what happened before in an attempt to gain
knowledge about the current state of the environment.
It is known, however, that no finite amount of history
can ever be sufficient to fully identify the underlying
state information of POMDPs in general.
Predictive state representations (PSRs) were introduced for
POMDPs as an alternative to histories. Instead of looking
back at what happened before, a PSR looks ahead into the
future and tries to predict what will happen. Formally, a
PSR is a vector of predictions (probabilities) for a number
of tests (finite sequences of actions and observations)
that forms a sufficient statistic for the environment.
In this project, we consider two papers as a starting
point.
Paper (1) connects POMDPs and PSRs in a fundamental way by
proving that every finite POMDP model has a linear PSR of
size bounded by the number of POMDP states. When the POMDP
model is fully known, a PSR can be constructed. However,
in many realistic applications, POMDP models are unknown
and need to be inferred from data. Paper (2) does precisely
that by introducing an algorithm for learning a PSR from
observed data from an unknown POMDP.
Eline Bovy + Nils Jansen
N2: Willem Lambooij
Solving zero-sum one-sided partially observable stochastic games
Karel Horák, Branislav Bošanský, Vojtěch Kovařik, Christopher Kiekintveld
Artificial Intelligence 316, 2023
Partially Observable Stochastic Games (POSGs) are models
for sequential decision-making problems with one or more
players where the players have partial information about the
state of the problem. POSGs extend Partially Observable
Markov Decision Processes (POMDPs) to the multi-agent
setting. The goal in a POSG is to find a Nash equilibrium,
i.e., decision-making strategies for each player such that
no player can benefit by changing their strategy.
This paper focuses on a subset of POSGs: two-player
zero-sum one-sided POSGs. Zero-sum means that the players
have directly opposing goals. A reward for one player leads
to an equivalent negative reward for the other. One-sided
means that only one of the players has partial observability
while the other has full observability.
The paper adapts the Heuristic Search Value Iteration
algorithm (HSVI) used for POMDPs to work for their
POSG subset. This algorithm approximates the expected
discounted reward when both players play according to
the Nash equilibrium. Furthermore, the paper provides
algorithms to derive the approximate Nash equilibrium, i.e.,
the decision-making strategies that lead to this expected
discounted reward. Finally, the paper evaluates their new
algorithms on various multi-agent sequential decision-making
problems.
Robbert Krebbers
R1: (not assigned)
Deriving Proved Equality Tests in Coq-Elpi: Stronger
Induction Principles for Containers in Coq
Enrico Tassi
ITP 2019
When defining algebraic data types in programming languages such as Rust and
Haskell one can let the language automatically generate boilerplate
definitions. An example is the equality `T_eq : T → T → Bool`. Generating
these definitions in proof assistants (such as Coq) instead of ordinary
programming languages (such as Rust or Haskell) is complicated. One should
also generate correctness proofs, e.g., `T_eq x y = true <-> x = y`.
This paper addresses the challenge of dealing with containers (e.g., using
`list` in the definition of n-ary trees) and providing modular proofs in the
presence of Coq's syntactic termination checker. The result involves a
technique called "unary parametricity" and has been implemented using the meta
programming language "Coq-Elpi". Coq-Elpi is a new language based on "λProlog"
that can be seen as a successor of Ltac.
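For comparison, here is my own Haskell sketch of the kind of boilerplate at
stake, including the container situation (an n-ary rose tree built with
`list`) that the paper handles on the Coq side; in Haskell the compiler just
generates the equality via deriving, whereas in Coq one must additionally
produce the correctness proof `T_eq x y = true <-> x = y`.

    -- The compiler generates structural equality, including through the list
    -- container used in the rose-tree definition.
    data Rose a = Node a [Rose a]
      deriving (Eq, Show)

    -- Written by hand, this is the boilerplate being generated: equality on
    -- the label plus pointwise equality on the list of children.
    eqRose :: Eq a => Rose a -> Rose a -> Bool
    eqRose (Node x ts) (Node y us) =
      x == y && length ts == length us && and (zipWith eqRose ts us)

    main :: IO ()
    main = do
      let t = Node (1 :: Int) [Node 2 [], Node 3 []]
          u = Node 1 [Node 2 [], Node 4 []]
      print (t == u, eqRose t u)   -- (False,False): the definitions agree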
This paper will excite you if:
- You want to understand how inductive types and termination checking work in
type theory / Coq.
- You want to know what "parametricity" is about
- You want to learn about meta programming in type theory / Coq.
By reading a second paper you can, for example, study other uses of meta
programming in Coq, study the theory of λProlog, investigate other challenges
(types with many constructors, dependent types).
Ike Mulder + Robbert Krebbers
R2: Menzo van Kessel
Subformula Linking for Intuitionistic Logic with Application to
Type Theory
Kaustuv Chaudhuri
CADE 2021
Traditional proof systems for (intuitionistic) first-order logic
have rules to manipulate a single connective in a single formula.
This means that sometimes, it takes quite a bit of effort to match hypotheses
with your proof obligation.
For example, suppose we are proving:
((A → A) → B ∧ C) ⊢ C ∧ B
When proving this, you want to match up the atoms B and C from your
hypotheses with these same atoms in the goal. But to do so, you must first eliminate
the implication (and prove (A → A)), then eliminate the conjunction, before
you can continue. In a sense, these steps are quite 'mechanical'.
This paper introduces 'subformula linking' for intuitionistic logic.
Rather than doing all these steps yourself, you simply 'link' the
occurrences of B in the above example. The paper introduces rules that
determine an appropriate remaining goal. There is a prototype tool,
called ProfInt, so that you can actually do this in the browser.
If you want to play around with it, see here:
https://chaudhuri.info/research/profint/
This paper should be interesting to you if you are interested in formal logic
and automated or interactive theorem proving. For the second paper,
you could look for example into earlier work on subformula linking in linear
logic,
or look into other work on 'gestural' theorem proving.
Marc Hermes + Robbert Krebbers
R3: Jakob Wuhrer
Certifying Graph-Manipulating C Programs via Localizations
within Data Structures
Shengyi Wang, Qinxiang Cao, Anshuman Mohan, Aquinas Hobor
OOPSLA 2019
[slides]
Data structures frequently involve underlying graph elements,
particularly when implemented using pointers in the heap. Such
implementations present unique hurdles for formal verification:
they may permit cycles and shared nodes, complicating the
establishment of invariants, and their operations often require
complex pointer manipulations.
The featured paper proposes a method grounded in a localization
rule, enabling modular reasoning by focusing on specific heap
segments relevant to the program's execution.
The papers listed in the accompanying slides present diverse
approaches for verifying graph-based data structures, making them
suitable for comparative analysis. "Proof Pearl: Magic Wand as Frame"
stands out as a great primer on the subject.
Wieb Bosma
W1: Sipho Kemkes
On the computational complexity of algebraic numbers:
the Hartmanis-Stearns problem revisited
Boris Adamczewski, Julien Cassaigne, Marion le Gonidec
Transactions of the AMS 373(5), 2020
https://adamczewski.perso.math.cnrs.fr/HS.pdf
The paper is concerned with the question from computational
complexity theory asking how difficult it is to compute the base
b digits of an algebraic real number. In this nicely written, recent
paper three negative results are proven, two of which are novel.
They state that it is impossible to generate the base b expansion
* by a finite automaton (or uniform morphism), or
* by a morphism with exponential growth, or
* by a deterministic pushdown automaton.
As a consequence, real numbers whose expansion is generated by any of these
devices are necessarily transcendental (if not rational).
It may not be necessary to master all the details of this long paper,
but you should certainly acquire a taste for the results and proofs.
For a second paper it may be useful to go back to either the paper
of Hartmanis and Stearns, or probably better, to the paper by
Cobham that appeared a few years later and in which he proved
the first result (apparently) and also stated the second result.