Deterministic simulation testing foundations (madsim) for linera-core · linera-io/linera-protocol#6108

Description

Parent tracker: #6107

Summary

Introduce madsim (a Rust async runtime for deterministic simulation, designed as a drop-in tokio replacement) into linera-core's test harness. Start with one representative test, validate determinism, then expand module by module.

Motivation

Several recent bug classes would have been catchable — and reproducible via seed replay — under deterministic simulation testing (DST):

The cluster of correctness fixes following the #5790 chain-actors refactor: cancellation safety (#6056 merged; #6051 closed and superseded), Arc lifecycle in storage cache (#6046), save-failure handling (#6053), load-failure handling (#6015).
The 2026-04-12 Conway incident combining TTL inversion (#5991) with silent OTLP queue drops.
The cross-chain mpsc-loss issue on pod kill (see investigation-cross-chain-crash-safety.md).
multileader-stuck-round consensus stall.

Shared shape: a race, cancellation, or ordering pattern reachable only under specific scheduling that normal tokio tests rarely produce. DST randomizes scheduling aggressively but derives every random choice from a seed, so a failing seed becomes a reproducible bug.
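The seed-replay idea can be sketched without any simulator. The sketch below is not madsim's scheduler; `SplitMix64` and `schedule` are hypothetical names used only for illustration. A small seeded PRNG picks which task runs next, so the whole interleaving is a pure function of the seed.

```rust
/// Tiny deterministic PRNG (SplitMix64); a stand-in for a simulator's seeded RNG.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Interleave the steps of several tasks; the seed alone decides the order.
/// `steps_per_task[i]` is the number of steps task `i` still has to run.
/// Returns the sequence of task indices in the order they were scheduled.
fn schedule(seed: u64, steps_per_task: &[usize]) -> Vec<usize> {
    let mut rng = SplitMix64(seed);
    let mut remaining = steps_per_task.to_vec();
    let mut order = Vec::new();
    loop {
        // Tasks that still have work to do.
        let runnable: Vec<usize> = (0..remaining.len())
            .filter(|&i| remaining[i] > 0)
            .collect();
        if runnable.is_empty() {
            return order;
        }
        // The only source of nondeterminism is the seeded RNG.
        let pick = runnable[(rng.next_u64() % runnable.len() as u64) as usize];
        remaining[pick] -= 1;
        order.push(pick);
    }
}
```

Calling `schedule(42, &[2, 2, 2])` twice returns the same order both times. That property is what turns a failing seed into a replayable bug.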

Non-obvious findings from initial investigation

None of madsim, turmoil, shuttle, or loom is currently used in linera-protocol, pm-app, or pm-infra, so this is a greenfield introduction.
madsim works via [patch]-replacement of tokio and related crates plus libc interception (gettimeofday, clock_gettime, getrandom, sysconf). Any third-party dep that reaches the real world through another path needs manual patching.
RisingWave uses madsim in production (see references). Polar Signals published a reference integration.
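Why a dependency that reaches real time "through another path" matters can be illustrated with a virtual clock. This is a sketch of explicit clock injection, a simpler technique than the crate replacement and libc interception described above, and it is not how madsim is implemented. `Clock`, `SimClock`, and `is_expired` are hypothetical names.

```rust
use std::cell::Cell;
use std::time::Duration;

/// Source of time that code under test depends on, instead of asking the OS.
trait Clock {
    /// Time elapsed since the start of the run.
    fn now(&self) -> Duration;
}

/// Simulated clock: time moves only when the harness advances it.
struct SimClock {
    now: Cell<Duration>,
}

impl SimClock {
    fn new() -> Self {
        SimClock { now: Cell::new(Duration::ZERO) }
    }

    /// Advance virtual time by `by`; no real time passes.
    fn advance(&self, by: Duration) {
        self.now.set(self.now.get() + by);
    }
}

impl Clock for SimClock {
    fn now(&self) -> Duration {
        self.now.get()
    }
}

/// Example consumer: a TTL check that sees only the injected clock.
fn is_expired(clock: &dyn Clock, created_at: Duration, ttl: Duration) -> bool {
    clock.now() >= created_at + ttl
}
```

Any code that bypasses the controlled clock, for example by reading the system time directly, would observe wall-clock time and break determinism. That is the kind of leak the dependency audit is meant to catch.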

Scope

In scope:

Cargo feature / alias to build linera-core tests against madsim instead of tokio.
At least one representative linera-core test exercising block proposal, chain-worker I/O, and cross-chain messaging — running deterministically with seed replay.
Developer doc: how to add a test under the harness, how to reproduce a failing seed, which third-party deps are patched, current limitations.

Stretch (worth attempting if scope allows):

Reproduce one historical bug via seed replay. Useful as a capability proof but requires porting pre-fix code to the harness — can balloon scope. Demote if blocking.

Out of scope (follow-ups):

Full test-suite conversion. Validator-level simulation. Wasm guest simulation.

Constraints

Default (non-madsim) test path stays untouched.
Each linera-core dep that touches time/randomness/network is audited and tracked in an issue comment as work progresses.

Acceptance criteria

Cargo feature in place; CI runs both paths.
≥1 non-trivial linera-core test passes under madsim with ≥1000 random seeds.
Reproducer runbook: how to re-run a failing seed.
Developer guide covering harness usage + limitations.

Critical paths

linera-core/Cargo.toml — add madsim as dev-dep + [patch] block.
linera-core/tests/ — entry surface.
linera-core/src/chain_worker/ + linera-chain/src/chain.rs — first behaviors to cover.

References

Contributor guide

Research direction: Implement madsim as a drop-in replacement for tokio in linera-core tests. Start by adding madsim as a dev-dependency and a [patch] section in Cargo.toml, then port a single non-trivial test (e.g., block proposal with cross-chain messaging) to run deterministically under madsim with seed replay. Document the setup and reproduction steps.
Tech stack: Rust
Domain: backend, testing, developer experience
Issue type: Research
Difficulty: 3
Estimated time: 1-2 days
Activity status: Fresh
Clarity: Clear
Prerequisites: Rust, Cargo, async programming
Newbie friendliness: 55