State machine inference for security protocols

This webpage tries to give a overview of experiments with using state machine inference (aka active automata learning) to analyse implementations of standard security protocols. If you know of experiments that we are missing in this overview, please send us an email!

Background

Implementing a security protocol usually involves implementing a finite state machine (aka a Deterministic Finite Automata or DFA) to correctly handle different protocol flows. If the happy flow is not implemented correctly then you quickly notice, as the implementation simply will not work, but logical flaws in handling non-standard protocol flows can be harder to spot. Such flaws can introduce security vulnerabilities, as has for instance been the case for TLS (see here and here).

State machine inference is a technique to automatically infer the state machine from an implementation by means of black-box testing. This means we obtain formal models for free! Inspecting the state machines obtained - either manually or using a model-checker - can reveal flaws in implementations. It may also reveal differences between implementations of the same protocol that allow fingerprinting of particular implementations. Such differences are also an indication that the specifications may be unclear or ambiguous.

Often protocol state machines are not well-specified or left largely implicit in specifications. Paying explicit attention to these state machines can be seen as adhering to the principles of LangSec (language-theoretic security), as discussed in more detail here. Active state machine learning can also be regarded as a form of fuzzing, where you do not fuzz individual messages, but rather sequences of messages.

Overview of protocols analysed

Below an overview of research into the use of using active learning to analyse implementations of standardised security protocols. (So we exclude e.g. botnet protocols in this overview.) This wiki provides additional information about some of the experiments listed below. This wiki also includes information n some case studies with protocols that are not security-related, such as TCP, MQTT, and SIP.

Network security protocols

TLS
- Joeri de Ruiter, A Tale of the OpenSSL State Machine: a Large-scale Black-box Analysis, NordSec 2016, LNCS volume 10014, pp. 169-184, Springer, 2016.
  [Models, Scripts]
- Joeri de Ruiter and Erik Poll, Protocol state fuzzing of TLS implementations, USENIX Security, USENIX, 2015.
  [Code, Models]
SSH
- Paul Fiterau-Brostean, Frits Vaandrager, Erik Poll, Joeri de Ruiter, Toon Lenaerts and Patrick Verleg, Model Learning and Model Checking of SSH Implementations, Paul Fiterau-Brostean, Frits Vaandrager, Erik Poll, Joeri de Ruiter, Toon Lenaerts and Patrick Verleg, SPIN 2017, p. 142-151, ACM, 2017.
  [Models]
- Toon Lenaerts, Improving-protocol state fuzzing of SSH, Bachelor thesis, Radboud University, 2017.
- Patrick Verleg, Inferring SSH state machines using protocol state fuzzing, Master thesis, Radboud University Nijmegen, 2016.
OpenVPN
- Lesly-Ann Daniel, Inferring OpenVPN State Machines Using Protocol State Fuzzing, Internship report, University of Rennes 1 and ENS Rennes, 2017
  [Code]
IPSEC
- Bart Veldhuizen, Automated state machine learning of IPsec implementations, Bachelor thesis, Radboud University Nijmegen, 2017.

Smartcards and banking tokens

EMV (EuroPay-Mastercard-Visa)
- Fides Aarts, Joeri de Ruiter and Erik Poll, Formal Models of Bank Cards for Free, SECTEST 2013, pp. 461-468, IEEE, 2013.
  [Code & models]
  More info and models here.
EMV-based banking tokens
- Georg Chalupar, Stefan Peherstorfer, Erik Poll and Joeri de Ruiter, Automated Reverse Engineering using Lego, WOOT'14, USENIX, 2014.
e-passport
- Fides Aarts, Julien Schmaltz, Frits Vaandrager, Inference and abstraction of the biometric passport, ISoLa 2010, LNCS volume 6415, pp. pp 673-686, Springer, 2010.
  More info and models here.

Industrial Control Systems

IEC 60870-5-104
- Max Kerkers, Assessing the Security of IEC 60870-5-104 Implementations using Automata Learning, MSc thesis, University of Twente, 2017
  [Source code]