Verifying concurrent, crash-safe systems with Perennial

Perennial is a system for verifying the correctness of systems with both concurrency and crash-safety requirements, including recovery procedures. For example, think of file systems, concurrent write-ahead logging like Linux's jbd2 layer, and persistent key-value stores like RocksDB.

Perennial uses Goose to enable verification of programs written in (a subset of) Go.
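
As a rough illustration of that subset (a hypothetical example, not code from this repository), Goose targets plain imperative Go built from structs, slices, maps, loops, and lock-based synchronization, along the lines of:

    package example

    import "sync"

    // Counter is a hypothetical example of the kind of Go code Goose is
    // designed to translate: a struct whose field is protected by a lock.
    type Counter struct {
        mu  *sync.Mutex
        val uint64
    }

    // Inc acquires the lock, updates the field, and releases the lock.
    func (c *Counter) Inc(delta uint64) {
        c.mu.Lock()
        c.val += delta
        c.mu.Unlock()
    }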

This repository also includes a proof of the transaction system in goose-nfsd, including a simple NFS server built on top.

Compiling

We develop Perennial using Coq master and maintain compatibility with Coq 8.13. If the CI (updated dependencies) build is broken, Perennial should still compile but is currently incompatible with an upstream change to one of our dependencies; we try to avoid this situation.

This project uses git submodules to include several dependencies. You should run git submodule update --init --recursive to set that up.

To compile just run make with Coq on your $PATH.

We compile with coqc.py, a Python wrapper around coqc to get timing information. The wrapper requires Python3 and the argparse library. You can also compile without timing information with make TIMED=false.

Compilation times

Perennial takes about 70-80 CPU minutes to compile. Compiling in parallel with make -j4 is significantly faster, and can cut the time down to 20-25 minutes.

Incremental builds are better once Iris and some core libraries have been compiled.

When you make a change to a dependency, you can keep working without fully compiling the dependency by compiling vos interface files, which skip proofs. The simplest way to do this is to run make vos, but it's fastest to pass a specific target, like make src/program_proof/wal/proof.required_vos, which builds only the vos dependencies needed to work on the wal/proof.v file.

If you're working on Goose and only need to re-check the Goose output, you can run make interpreter to run the interpreter's semantics tests, or directly compile just the Goose output files.

Coq also has a feature called vok files, where coqc fully checks a file, including its proofs, while requiring only the vos interface files of its dependencies. The process does not produce a self-contained vo file, but emits an empty vok file to record that checking is complete. This allows checking individual files completely and in parallel. Using vos and vok files can significantly speed up the edit-compile-debug cycle. Note that technically vok checking isn't the same as regular compilation - it doesn't check universe constraints in the same way.

Source organization

src/

  • program_logic/ The main library that implements the crash safety reasoning in Perennial. This includes crash weakest preconditions, crash invariants, idempotence, and crash refinement reasoning.

  • goose_lang/ A lambda calculus with memory, concurrency, and an "FFI" to some external world, which can be instantiated to provide system calls specified using a relation over states.

    This language is the target of Goose and thus models Go; it also implements the Iris language interface and Perennial's crash_lang interface, so it can be used for proofs in Perennial.

    This directory includes a lifting to ghost state that supports more standard points-to facts, in contrast to the semantics, which specifies transitions over the entire state. It also proves Hoare triples for the primitives of the language in terms of these resources.

    • typing.v A type system for GooseLang. This type system is used as part of the typed_mem/ machinery. TODO: there's much more to say here.

    • lib/ GooseLang is partly a shallow embedding - many features of Go are implemented as GooseLang code rather than language primitives. These features are divided into several libraries, each with an implementation and a proof file: for example, map/impl.v implements operations on maps using GooseLang's sums, while map/map.v proves Hoare triples for that implementation. Separating out the implementation allows us to run it in the GooseLang interpreter without compiling the proofs.

      • typed_mem/ Implements support for flattening products over several contiguous locations, which is the foundation for supporting struct fields as independent entities. The proofs build up reasoning about the l ↦[t] v assertion, which says l points to a value v of type t (in the GooseLang type system). If t is a composite type, this assertion owns multiple physical locations.
      • struct/ Implements struct support. Structs are essentially tuples with names for the fields. The theorems proven here culminate in a way to split a typed points-to for a struct into its individual fields.
      • slice/ Implements Go slices using a tuple of an array, length, and capacity.
      • map/ Implements Go maps as a linked list of key-value pairs, terminating in a default value.
      • loop/ Implements a combinator for loops on top of the basic recursion support.
      • lock/ Implements locks and condition variables using a spin lock, which is implemented using CmpXchg (see the sketch after this list).
      • encoding/ Implements uint64 and uint32 little-endian encoding.
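
      For the lock/ library, for instance, here is a minimal Go-level sketch of a spin lock built on compare-and-swap (my own illustration of the idea, not the GooseLang implementation itself):

        package example

        import "sync/atomic"

        // SpinLock is a hypothetical sketch of a CmpXchg-based spin lock.
        type SpinLock struct {
            locked uint32
        }

        // Acquire spins until the compare-and-swap flips the flag from 0 (free) to 1 (held).
        func (l *SpinLock) Acquire() {
            for !atomic.CompareAndSwapUint32(&l.locked, 0, 1) {
            }
        }

        // Release marks the lock as free again.
        func (l *SpinLock) Release() {
            atomic.StoreUint32(&l.locked, 0)
        }
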
    • examples/ The Goose unit tests; these are auto-generated from the Goose repo, from internal/examples/.

    • interpreter/ An interpreter for sequential GooseLang, equipped with a proof of partial correctness: if the interpreter produces a result, the semantics also can.

      This is used to implement tests in generated_test.v. These tests are Go functions which return booleans that should be true; we check this using Go test, and compare against the interpreter's behavior.
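
      For a sense of their shape, a hypothetical test (not one of the generated ones) looks like this:

        package semantics_example

        // testSliceAppend is a hypothetical example of a Goose semantics test:
        // a Go function returning a boolean that should be true, checked both
        // by go test and by running the translated GooseLang code in the
        // interpreter.
        func testSliceAppend() bool {
            s := make([]uint64, 0)
            s = append(s, 1, 2, 3)
            return len(s) == 3 && s[2] == 3
        }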

    • ffi/

      Two different FFIs to plug into GooseLang - disk.v is the one we actually use, while append_log.v is the FFI-based specification for the append_log example.

  • program_proof/

    The proofs about programs we have so far.

    • append_log_proof.v Hoare triples about the append_log example, which is implemented in the Goose repo at internal/examples/append_log/ and extracted to goose_lang/examples/append_log/.

    • examples/ Examples we wrote for POPL

    • wal/, txn/, and buftxn/ proof of the transaction system library in goose-nfsd

    • simple/ proof of a simple NFS server

  • Helpers/

    • Integers.v Wrapper around coqutil's word library for u64, u32, and u8.
    • Transitions.v A library for writing relations in a monadic, combinator style.
    • NamedProps.v An Iris library for naming hypotheses within definitions and using them to automatically destruct propositions.
    • other files are basically standard library extensions
  • algebra/

    Additional CMRAs for Iris ghost variables

It's also worth noting that external/Goose contains committed copies of the Goose output for some real code we have. This includes github.com/tchajed/marshal and github.com/mit-pdos/goose-nfsd. The directory structure here mirrors the way Go import paths work.

Publications

Perennial 1 is described in our SOSP paper, "Verifying concurrent, crash-safe systems with Perennial". The actual codebase was quite different at the time of this paper; it notably used a shallow embedding of Goose and did not have WPCs or any of the associated program logic infrastructure.

Goose is briefly described in a CoqPL extended abstract and associated talk, "Verifying concurrent Go code in Coq with Goose".

The verified interpreter and test framework for Goose is described in Sydney Gibson's masters thesis, "Waddle: A proven interpreter and test framework for a subset of the Go semantics".

Owner

MIT PDOS, the Parallel and Distributed Operating Systems group at MIT CSAIL.

Comments
  • Distriris

    This is the first step towards supporting reasoning about distributed systems: we add a type global_state to the language interface, which is threaded through everywhere together with the (per-machine) state; and we add a global_state_interp as state interpretation for this new state. This is separate from state_interp because the latter will be per-machine in the adequacy proof, while global_state is, well, global.

    goose_lang for now picks global_state := () and global_state_interp g := True. Later, this will be extensible via FFI in the same manner as state. Independently of that I will work on adding a notion of "executing a distributed system" and a corresponding adequacy theorem, but that should hopefully not incur further breakage elsewhere in Perennial.

    @upamanyus @krawthekrow if you have anything not in ShouldBuild, it might be broken more or less horribly by this... @jtassarotti please check what I did in jrnl_ffi and more generally around the refinement and twophase stuff; I got it to build but I am not sure if my choice of definitions is always right. In particular in jrnl_ffi.v I sometimes destruct g to exploit that global_state is () and thus cannot possibly have changed.

  • Remove later in definition of na_crash_inv

    This is an incomparable definition: it's weaker in that allocating a crash invariant now requires proving persistently(Q -* P) immediately rather than under a later, but it's stronger in that a crash invariant now provides persistently(Q -* P) without a later.

    Note that if the guarantee P is timeless the later in the definition doesn't matter.

  • Fix misuse of modes for SemiSet

    Without this fix, a wrong instance involving option unit is found at intros ?%elem_of_singleton. It seems the SemiSet modes (in stdpp?) are not set correctly, since this declaration is necessary for unambiguous typeclass resolution in the proof script below. I'd be happy if you could merge this fix here and later submit it upstream so as not to delay https://github.com/coq/coq/pull/10858 any further. The fix should of course be backwards compatible.

  • Adapt to a more-principled name generation in the "%" introduction pattern

    Hi, this PR adapts to Coq PR coq/coq#13512 which makes the naming in the presence of the % introduction pattern compatible with name mangling. As a result, an H0 is now named H in perennial's file src/goose_lang/lib/struct/struct.v.

    We fix the corresponding proof in a backward-compatible way by not letting Coq generate the name. If you agree with the fix, it can be merged right away.

  • Please update iris and stdpp again

    Could you please update the submodules to stdpp@607ee2b1 and iris@5070f7f5 ? I've found a slowdown in perennial due to the change in stdpp for coq/coq#13969, which should be fixed now.

  • adjust for opaque pointer type

    This is the Perennial side of https://github.com/tchajed/goose/pull/28. I re-goosed most things, except for perennial-examples which is not in a clean state currently (something related to async).

    In typing.v there are some predicates that seem to be used only in the logrel; originally I meant ptrT to be consistent with structRefT, but that imposes some new proof obligations in logical_reln_fund, so instead I made ptrT return false for all of those... @jtassarotti is that okay?

  • Backward compatible fix

    Due to coq/coq#14392, only the heads of applications must not be evars for ! to match a term during typeclass resolution. This makes the Iris automation more powerful, resulting in a breakage in the proof script.

  • Don't rely on ssreflect setoid rewrite bug (coq/coq#13882)

    Before the mentioned Coq PR, "rewrite foo in H *" with ssreflect rewrite and a setoid equality would leave the goal with H reverted.

    Instead we manually revert H and don't use "in".

  • Added later credit generation across pure steps

    Added a tactic wp_pure1_credit "HcredName" that takes a single pure step and introduces a hypothesis with a later credit with the name "HcredName".

    I made minimal changes while enabling credit generation for pure steps. So, for instance, if we want later credit generation in the wp_load lemma in the future, we'll need to expose it in several additional wp_* lemmas in the various lifting.v files.

  • check on CI that Goose files are clean

    Adds a little script and CI job to re-run Goose and check that git diff remains clean.

    For now, Goose and the Go code repos are pinned to a particular commit each, but we could pin them to a branch as well.

    I confirmed that before https://github.com/mit-pdos/perennial/commit/06834cebd125ceef376e547e471f80f770700f87, this indeed failed.

  • Add comment explaining scope annotations in atomic_fupd

    Ralf wrote a comment explaining the explicit scope annotations on Mattermost and I thought it would make for good documentation (we might use this file in Aneris).

  • Simplify wptp_recv_strong_normal_adequacy

    The lemma wptp_recv_strong_normal_adequacy quantifies over an argument ns. But, if ns is not the empty list the first assumption is false and the lemma is pretty vacuous.

    This PR changes the lemma by, essentially, inlining ns = []. This simplifies the statement, its proof, and the places where the lemma is used.

  • Speed up compilation

    I've wanted to improve compile times for a while. So far my attempts to do so have failed because I don't know exactly what is slow, other than program proofs.

    The first problem is getting data: what's slow about compilation? We have per-file information, but we need some other measurement. Is it Ltac? If it's Ltac, is it something about the proof mode? Is it typeclass resolution?

    Some things we can try:

    • Figure out how much time per file goes into Ltac vs everything else.
    • Attempt to get a whole-project Ltac profile.
    • Look at an OCaml profile of Coq compiling the whole project.

    If it's the proof mode, one piece of useful infrastructure would be a synthetic benchmark that tests performance as some scaling factor increases, like the number of hypotheses or size of terms in the environment. Jason's projects (like reification by parametricity) are extremely good about these, which let them both track performance and identify asymptotic inefficiencies.

  • Develop an asynchronous inode resource to reason about buffered state

    On top of the new async_inode, which implements an inode whose specification includes in-memory buffering, we should develop a resource-based specification for individual offsets in the inode.

    Resources

    • A points-to fact for an offset in the inode. These are persistent because offsets don't change. We can think of the inode as being just a single block (the latest appended one), except that it persists all old versions; this makes the inode resemble the transaction system much more closely.
    • A monotonic durable lower-bound that specifies which index is persisted.

    Rules for manipulating these resources

    First, a points-to fact together with a lower-bound above that index should be durable, in some sense, in that it can be reconstructed after a crash. This may involve explicitly lifting from one "epoch" number to another, so that we can really relate one crash state to the next without destroying any state.

    Secondly, it should be true that if one offset is present after a crash all previous ones are as well. This seems to involve some post-crash resource that says which index we crashed to; even if it was non-deterministically chosen, it's fixed after the initial choice.

  • Implement an asynchronous disk FFI and corresponding resources

    We should set this up to start understanding how to deal with these non-deterministic crashes, particularly building up the right proof principles. The best test case would be to prove the simplest write-ahead log on top of an asynchronous disk and expose a synchronous, transactional API that doesn't have buffered writes.

    The model is fairly easy: each disk block has a list of buffered possible crash values. Reads observe the latest write, and on crash we non-deterministically pick a buffered write to be the new value. This also gives a natural starting point for the resource algebra, which is just a current value and a set of crash values, per address.
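
    A minimal Go sketch of that per-block model (my own illustration of the description above, not code from the repository):

      package asyncdisk

      // Block is a hypothetical sketch of one disk block in the asynchronous
      // model: a non-empty list of buffered values, any of which a crash may
      // pick as the block's new value. The last entry is the latest write.
      type Block struct {
          buffered [][]byte
      }

      // Read observes the latest write.
      func (b *Block) Read() []byte {
          return b.buffered[len(b.buffered)-1]
      }

      // Write buffers a new value; older values remain possible crash values.
      func (b *Block) Write(v []byte) {
          b.buffered = append(b.buffered, v)
      }

      // Crash models the non-deterministic choice: the environment picks which
      // buffered write becomes the block's value after the crash.
      func (b *Block) Crash(pick int) {
          b.buffered = [][]byte{b.buffered[pick]}
      }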

  • Make sure allocation theorems actually apply after [simpl]

    The various allocation methods (ref, ref_to, NewSlice, NewMap) sometimes get unfolded and then the nice typed theorems about them don't apply. We should make sure they use opaque representations which are values (in some cases they were incorrectly made expressions, which have to be simplified to substitute through them).

  • Write a new typed slice library with a clean abstraction

    The slice library is rather large, which makes it painful to shadow and re-export everything with types.

    I think perhaps a better design is to think of a slice as naming a set of locations, without owning pointers for those locations. Then, we can instantiate a predicate per location; this subsumes the current is_slice_small and updates_slice, and similar notions. Subslicing just changes the set of locations. If this turns out to be difficult or not useful, we can revert to essentially the untyped slice library but ported to have types.

    This involves doing the following:

    • [ ] try out a new is_slice predicate that does not consume space but can be combined with a predicate per location
    • [ ] implement the current API using the new predicate
    • [ ] port all users of the untyped library to the typed version
    • [ ] rename is_slice_small to is_slice and is_slice to is_slice_append
    • [ ] use slice_cap separately where possible