¶Concurrency and property-based testing

2023-08-04

Recently I have once again started noodling with an implementation of the Pastry peer-to-peer network architecture. None of the code is ready for prime-time yet, but an interesting tension came up that I thought was worth writing about.

Pastry

I'm using Go for this current implementation of Pastry. (I've dabbled in the past with C and Rust implementations, too.) I am focusing just as much effort on the testing code as on the implementation of the core algorithms. In particular, I'm leaning heavily on property-based testing using the Rapid library. In each test case, I use Rapid to instantiate random “universes” of nodes, and test various properties of the applications running on the overlay network.

Rapid

A real-world overlay network would exhibit inconsistent timings in a number of places: as messages are sent between nodes, as nodes process the messages that they receive, etc. (A real-world network would also have to handle failure of message delivery, but that's another concern that isn't relevant to this particular discussion.) To make sure that the test suite is exercising realistic behavior, it's important for the randomized test networks to exhibit these same timing inconsistencies.

I'm using Go for this current implementation of Pastry, which suggests a natural way to do that: have each test node run in a separate goroutine, and rely on the concurrency of the Go scheduler to interleave ordering and add delays in message delivery. I would then rely on the property test harness to perform enough iterations of each test to exercise all of the important interleavings and detect bugs that only show up with particular orderings or delays. I went ahead and coded it up this way, and it works well!

However, it turns out that this introduces two sources of randomness: the Go scheduler, and the property test harness. These two sources conflict with each other, since the property test harness assumes that it is the only source of randomness, and that each individual test case iteration (defined by a test function and a random seed) is perfectly reproducible. Without this reproducibility, the harness has a hard time minimizing test failures into small sets of inputs.

To solve this, it seems like I need to remove the goroutine concurrency, and instead have the test network implementation enforce a particular ordering and timing across all of the messages delivered as part of a test case. I will still want to exercise different orderings and timings, but I'll have to do so in a way that interacts well with the test harness — presumably by consuming its randomness source in some way.