Bonus — Fuzzing & Invariant Testing: Advanced Patterns

“The unit test asks: ‘does this function do what I wrote it to do?‘. The invariant asks: ‘does this protocol still make sense after a million random people interact with it in an order I never imagined?‘. Auditors who get paid to find bugs that humans miss live in the second question. Tools just help them stay there longer.”

Tags: web3-security tool foundry echidna medusa fuzzing invariant methodology vault erc4626 Learner: Has Tuan-04-Security-Foundations-CEI-AC, Tuan-05-Vulnerability-Classes-Part-1, and Tuan-15-Audit-Methodology-Tooling §6 + §9 under their belt. Wants to go from “I can write an invariant_* function” to “I can stand up a production-grade stateful fuzz harness for an unknown protocol in a day, and reason about what it cannot find.” Time: 3–5 working days (depending on prior Foundry depth) Related: Tuan-15-Audit-Methodology-Tooling · Tuan-Bonus-Formal-Verification-Deep · Tuan-08-DeFi-Security-AMM-Lending-Vault · invariant-test-templates · lab-dev-environment


1. Context & Why

1.1 The plateau this lesson exists to break

After Week 15 you can answer “what’s an invariant?” and “what’s a handler?“. You can write invariant_solvency() against a vault and run forge test --invariant-runs 256. Tool finds nothing. Audit shipped. Three months later a bug ships that the fuzzer “should have” caught.

What went wrong is almost never the engine. It’s one of:

  1. You fuzzed the wrong surface — handler called the wrong functions, with the wrong actor distribution, in the wrong order.
  2. Your invariants were too weak — they describe the happy path the developer already tested for, not the edge the auditor is paid to find.
  3. You ran too short — 256 runs × 100 depth is a demo setting. Real catches need millions of sequences, with corpus persistence.
  4. You missed a degree of freedom — single-actor harness, no time advance, fixed asset price, no donations, no rounding-mode adversary. The bug needs two of those at once.
  5. You didn’t read the failures — a “no failures” run is a negative result. The result is only as strong as the surface you exercised.

This lesson is the upgrade from “tool user” to “auditor who runs the tool”. Half of it is mechanics. The other half is the audit-budget discipline that makes mechanics pay off.

1.2 The unifying frame

A fuzz/invariant harness is a falsification engine. Your job:

Specify the predicate you believe must hold across all reachable states, then build the smallest harness that gives the engine the right freedoms to falsify it.

If the predicate is too narrow (e.g., totalSupply() >= 0 on a uint), the engine wastes runs and finds nothing — predicate is tautological. If the harness is too narrow (single actor, fixed price, single function), the engine cannot reach the violating state — predicate is unreachable through this surface, not unfalsifiable. Auditors confuse these constantly; the second is more common and more dangerous because the test passes with high confidence and zero coverage of the real failure mode.

The Trail of Bits methodology slogan distills this: “finding the right invariants is half the work; building the right handler is the other half.” Running the tool is mechanical.

1.3 Learning goals

By the end you can:

  • State the difference between stateless and stateful fuzzing, and pick the right one for a given target.
  • Build a handler-based Foundry invariant suite with ghost variables, multi-actor pranking, bounded inputs, and targetSelector discipline — for an unknown protocol — in under 3 hours.
  • Write the same suite as an Echidna properties + assertion harness and explain trade-offs.
  • Configure Medusa for parallel mainnet-fork fuzzing and explain when it beats Echidna.
  • Read a forge invariant failure trace and reproduce it as a deterministic Foundry unit test.
  • Manage a fuzz corpus across audit and remediation phases.
  • Allocate audit-time across invariant discovery / handler authoring / run / triage — without one phase eating the others.
  • Recognise the four shapes of bug a fuzzer cannot find (and so know when to escalate to Tuan-Bonus-Formal-Verification-Deep).

1.4 Primary references

SourceURLStatus
Foundry Book — Invariant Testinghttps://book.getfoundry.sh/forge/invariant-testingCurrent
Foundry Book — Fuzz Testinghttps://book.getfoundry.sh/forge/fuzz-testingCurrent
forge-std StdInvariant + boundhttps://github.com/foundry-rs/forge-std/blob/master/src/StdInvariant.solCurrent
Echidna README + tutorialhttps://github.com/crytic/echidna and https://github.com/crytic/echidna/blob/master/tests/README.mdCurrent; v2.3.x
Medusa README + docshttps://github.com/crytic/medusa and https://secure-contracts.com/program-analysis/medusa/Current; v1.5.x [verify]
Trail of Bits — Building Secure Contracts: Fuzzing chapterhttps://github.com/crytic/building-secure-contracts/tree/master/program-analysisCurrent — canonical methodology
ToB blog — Echidna postshttps://blog.trailofbits.com/category/echidna/ (e.g., “How to fuzz like a pro” series)Multiple posts; check 2023–2025 entries
Recon (getrecon.xyz)https://getrecon.xyz/ and https://book.getrecon.xyz/Current — invariant test scaffolding service / open-source helpers
Spearbit / Cantina blog — invariant postshttps://cantina.xyz/blog (search “invariant”) and https://spearbit.com/researchSeveral relevant posts; check 2024–2026 entries
Sample real-world invariantsAave V3 properties (https://github.com/aave/aave-v3-origin/tree/main/tests/certora and ToB review repos); Maker dss invariants; Liquity properties; Compound III testsBrowse for inspiration
Echidna paper (FSE 2020)“Echidna: effective, usable, and fast fuzzing for smart contracts” by Grieco et al.Background reading, not required
Halmoshttps://github.com/a16z/halmosCross-ref; symbolic complement to fuzzing

[verify] Tool versions move. Pin in .tool-versions / Docker image for reproducibility.


2. Foundations — Stateless vs Stateful Fuzzing

2.1 The two regimes

A fuzzer chooses inputs randomly within a search space. The shape of that space is what separates the regimes.

Stateless fuzzing (a.k.a. property-based testing à la QuickCheck):

// Foundry stateless fuzz
function testFuzz_addCommutes(uint256 a, uint256 b) public pure {
    assertEq(a + b, b + a);
}

Each run starts from a fresh state. One function call. The property either holds or doesn’t, for this one call. Good for:

  • Pure math libraries (mulDiv, fixed-point ops, curve formulas).
  • Single-function pre/post conditions.
  • Encoding/decoding round-trip checks.

Stateful fuzzing: each run is a sequence of calls against a deployed contract. The engine drives the contract through many transitions, then evaluates the invariant after each transition (or at the end of the sequence).

// Foundry stateful invariant (sketch)
function invariant_solvency() public view {
    assert(vault.totalAssets() >= ghost_depositSum - ghost_withdrawSum);
}

Stateful is the regime that matches real protocols. Most real bugs are stateful — they require a particular sequence of transitions to surface (deposit, donate, withdraw, accrue; or borrow, accrue, liquidate, repeat).

2.2 When stateless is enough

You almost always still write some stateless tests, even in a stateful-heavy suite:

  • For libraries (FixedPointMathLib, TickMath, OracleLib) — one-shot mathematical invariants.
  • For bounded helpers — “this encoding always round-trips”.
  • As fast feedback during development; stateless tests are 10–100× faster per run.

The mistake is the opposite: writing only stateless tests for a stateful protocol. You verified every leaf function in isolation but never asked “what if leaf A runs, then leaf B, then leaf A again, with a re-entry between?“.

2.3 The invariant as the specification

In Week 15 §6 you learned to enumerate invariants. Here we treat the list as the specification of correctness, against which fuzzing produces evidence.

The grading rubric for a candidate invariant before you encode it:

QuestionIf “no” →
Is it falsifiable in principle?Drop it; it’s a tautology.
Can the fuzzer reach a state where it might fail?Fix the handler, not the invariant.
Is the predicate precise (exact equality vs ≈ with tolerance)?Decide and document the tolerance now.
Does it depend on off-chain or external state (oracle, time)?Add mocks or ghost variables that capture that state.
Is “passes” meaningful, or does it pass trivially because handler never moves state?Add an invariant_callSummary to confirm the surface is exercised.

A list of high-grade invariants is a deliverable by itself — many engagements pay for the invariant catalogue alone. (Recon and similar services bundle this as a product.)

2.4 Coverage-guided vs blind random

Modern fuzzers are coverage-guided: they instrument execution to detect when a new branch is taken, save those inputs in a corpus, and prefer mutations of corpus entries on future runs. This is enormously more effective than purely random input — the engine learns which inputs reach deeper code.

  • Echidna is coverage-guided since v1.4. Coverage data persists in corpusDir/.
  • Medusa is coverage-guided and parallel; each worker shares corpus.
  • Foundry’s invariant runner is less coverage-guided in the same sense — it does mutational fuzzing but its corpus model is weaker. This is the single most-cited reason to also run Echidna/Medusa for deep-coverage exploration.

Practical consequence: persist your corpus across runs. Don’t rm -rf corpus/ between sessions; you’re throwing away hours of warm-up.


3. Foundry Invariant Testing — Mechanics

3.1 The data flow

flowchart LR
    A[setUp] --> B[Pick run]
    B --> C[Pick depth N]
    C --> D[For i in 1..N:<br>pick target contract<br>pick selector<br>pick params]
    D --> E[Call target.fn<br>params]
    E --> F[All invariant_*<br>functions pass?]
    F -- yes --> G{i < N?}
    F -- no --> H[Shrink<br>then fail]
    G -- yes --> D
    G -- no --> I[End run]
    I --> J{runs left?}
    J -- yes --> B
    J -- no --> K[Pass]
    style F fill:#fff2cc
    style H fill:#ffcccc
    style K fill:#caffbf

Key knobs:

  • runs (foundry.toml [invariant] section, or --invariant-runs N): number of independent random sequences attempted.
  • depth: length of each sequence (number of calls before invariants are re-checked at the end and — by default — after every call).
  • fail_on_revert: if false, reverted calls are skipped (sequence continues); if true, any unexpected revert fails the run.
  • shrink_run_limit: how hard to try shrinking a failing sequence down to a minimal reproducer.
  • call_override: whether handler-detected vm.assume violations skip the call (advanced).

A typical mature foundry.toml:

[invariant]
runs = 1000
depth = 250
fail_on_revert = true
shrink_run_limit = 10000
 
[fuzz]
runs = 10000
seed = "0xdeadbeef"  # pin for reproducibility during a triage session

For overnight runs scale runs to 50_000+ and depth to 500+. The compute cost is linear; the bug detection probability rises with the combined search size (and shape of corpus).

3.2 targetContract, targetSelector, excludeContract

By default, the invariant runner picks calls randomly across all known contracts in the test environment. For anything non-trivial this is too broad — most calls revert, surface coverage is shallow.

Restrict the surface:

function setUp() public {
    handler = new VaultHandler(vault, asset);
    targetContract(address(handler));               // only call into the handler
 
    // Further restrict to specific handler functions
    bytes4[] memory selectors = new bytes4[](4);
    selectors[0] = VaultHandler.deposit.selector;
    selectors[1] = VaultHandler.withdraw.selector;
    selectors[2] = VaultHandler.transferShares.selector;
    selectors[3] = VaultHandler.donate.selector;
    targetSelector(FuzzSelector({ addr: address(handler), selectors: selectors }));
 
    // Exclude known-broken or out-of-scope contracts
    excludeContract(address(mockToken));
}

There’s also targetSender(addr) (which sender accounts to use) and excludeSender(addr). We prefer to manage senders inside the handler with useActor() modifiers (see §3.5) for explicit control, but the cheatcodes exist.

Audit smell — wildcard targets: a setUp that calls targetContract(address(vault)) directly is almost always wrong for a non-trivial protocol. Every external function on the vault becomes a fuzz target with raw uint256 parameters; most reverts on the first require(amt <= balance). Coverage is illusory. Use a handler.

3.3 bound() vs vm.assume()

Two ways to constrain fuzz inputs:

// Bound: maps raw uint256 to a valid range
function deposit(uint256 amtSeed) external {
    uint256 amt = bound(amtSeed, 1, 1e24);  // every input becomes valid
    vault.deposit(amt, msg.sender);
}
 
// Assume: discards inputs outside the range
function deposit(uint256 amt) external {
    vm.assume(amt > 0 && amt < 1e24);  // most fuzz inputs discarded
    vault.deposit(amt, msg.sender);
}

Almost always use bound. Reasons:

  1. Every fuzz run produces a valid call. vm.assume discards, wasting iterations. With a tight assume, 99% of inputs are discarded and your effective sequence depth collapses.
  2. The mapping is deterministic. Same seed → same bounded value, so shrinking and reproduction work.
  3. No depth-budget bias. Discarded calls still count against engine budgets in some configurations.

Reserve vm.assume for genuinely sparse predicates (“address is not a precompile”) where mapping isn’t natural.

bound is defined in forge-std/StdUtils.sol. It uses ((x - lo) % (hi - lo + 1)) + lo semantics — preserves uniformity over the input space within [lo, hi].

3.4 Ghost variables — tracking what the protocol doesn’t

The protocol’s storage holds “what the contract knows”. Ghost variables hold “what an honest accountant would know, watching from outside”.

Examples:

GhostTracks
ghost_depositSumCumulative assets deposited across all calls
ghost_withdrawSumCumulative assets withdrawn
ghost_perActorDeposits[a]Per-actor cumulative deposits
ghost_lastSharePriceMost-recent share price; checked monotone after every transition
ghost_donationsDonations-via-direct-transfer (separate from deposit)
ghost_calls["deposit"]Number of times each function was called
ghost_reverts["deposit"]Number of unexpected reverts per function

Convention: prefix ghost storage with ghost_, make them public so invariants can read them, update them in the handler immediately after the protocol call succeeds.

The invariants then take the form:

function invariant_solvent() public view {
    assertGe(
        vault.totalAssets(),
        handler.ghost_depositSum() - handler.ghost_withdrawSum(),
        "vault assets below net deposits"
    );
}
 
function invariant_per_actor_share_consistent() public view {
    address[] memory actors = handler.getActors();
    for (uint i = 0; i < actors.length; i++) {
        uint256 expectedAssets = handler.ghost_perActorDeposits(actors[i]);
        uint256 sharesOwned    = vault.balanceOf(actors[i]);
        uint256 assetsRedeemable = vault.previewRedeem(sharesOwned);
        // Allow rounding loss bounded by share count
        assertApproxEqAbs(assetsRedeemable, expectedAssets, sharesOwned + 1);
    }
}

The key insight: the ghost is the source of truth the invariant compares the protocol against. Without ghosts, you can only assert internal consistency (which is what unit tests already do). With ghosts, you assert external consistency — the protocol’s reported state matches what an outsider knows happened.

3.5 Multi-actor handler pattern

A single-msg.sender handler misses entire bug classes — first-depositor donation, share dilution, MEV-like ordering attacks all require ≥2 distinct callers.

contract VaultHandler is Test {
    Vault public vault;
    MockToken public asset;
    address[] public actors;
    address internal currentActor;
    uint256 public actorIndex;
 
    modifier useActor(uint256 actorSeed) {
        actorIndex = bound(actorSeed, 0, actors.length - 1);
        currentActor = actors[actorIndex];
        vm.startPrank(currentActor);
        _;
        vm.stopPrank();
    }
 
    constructor(Vault _v, MockToken _a) {
        vault = _v;
        asset = _a;
        // 5 distinct addresses, each with funded balances
        for (uint i = 0; i < 5; i++) {
            address u = address(uint160(0x1000 + i));
            actors.push(u);
            vm.deal(u, 1000 ether);
            asset.mint(u, 1e30);
            vm.prank(u);
            asset.approve(address(vault), type(uint256).max);
        }
    }
 
    function deposit(uint256 actorSeed, uint256 amt) external useActor(actorSeed) {
        amt = bound(amt, 1, asset.balanceOf(currentActor));
        if (amt == 0) return;
        try vault.deposit(amt, currentActor) {
            ghost_depositSum += amt;
            ghost_perActorDeposits[currentActor] += amt;
            ghost_calls["deposit"]++;
        } catch {
            ghost_reverts["deposit"]++;
        }
    }
 
    function getActors() external view returns (address[] memory) { return actors; }
}

Now any sequence can interleave actions across actors. Donation attacks (where actor A inflates totalAssets to make actor B’s deposit round to zero shares) become reachable.

3.6 Time and block advances

Many protocols are time-sensitive (interest accrual, vesting, expiries, rebases). Default Foundry fuzzing doesn’t advance time. Add handler functions:

function warp(uint256 seed) external {
    uint256 jump = bound(seed, 1, 30 days);
    vm.warp(block.timestamp + jump);
    ghost_warps++;
}
 
function roll(uint256 seed) external {
    uint256 jump = bound(seed, 1, 100_000);  // ~ months of blocks
    vm.roll(block.number + jump);
}

Add warp.selector and roll.selector to your targetSelector list. Now sequences can include long-time advances and reveal accrual / expiry bugs.

Note: in fail_on_revert = true mode, ensure your time advances don’t push state past expiries that would cause subsequent valid-looking calls to revert. Either cap the jump or model expiries explicitly.

3.7 Reading invariant failures

A failing run prints something like:

[FAIL: assertion failed: vault assets below net deposits]
        [Sequence]
                sender=0x1000000000000000000000000000000000000003
                addr=[handler]
                calldata=deposit(seed=0x..., amt=0x...)
                args=[actor_3, 999_999_999_999_999_999_999]
                ---
                sender=0x1000000000000000000000000000000000000001
                addr=[handler]
                calldata=donate(amt=0x...)
                args=[1_000_000_000_000]
                ---
                sender=0x1000000000000000000000000000000000000003
                addr=[handler]
                calldata=withdraw(seed=0x..., shares=0x...)
                args=[actor_3, max]
                ---
        Invariant invariant_solvent (handler runs: 1234, calls: 3)

The sequence is the minimal reproducing trace (after shrinking). Each line is a call.

Workflow:

  1. Reproduce as a deterministic unit test. Copy the sequence verbatim into a test_repro_* function with literal arguments. Confirm it fails. This frees you from random-seed flake during fix iteration.
  2. Understand the trace. What state was each actor in before/after each call?
  3. Decide: bug in invariant or bug in code? Sometimes the invariant is over-stated (e.g., forgot rounding tolerance). More often the code is wrong.
  4. Fix → re-run invariant suite. Same seed, but also re-run with new randomness.

Pin the seed during a triage session (seed = "0xdeadbeef" in [fuzz] section, or --invariant-seed) so the failure reproduces deterministically while you iterate. Unpin for production runs.

3.8 The Recon scaffolding workflow

Recon (and their open-source chimera framework) provides an opinionated scaffold for invariant suites that codifies the patterns above:

  • A Setup.sol (deploys everything).
  • A TargetFunctions.sol (one function per protocol function; each maps a fuzzed seed → bounded call).
  • A Properties.sol (the invariants).
  • A CryticToFoundry.t.sol (run the same suite via Foundry).
  • A BeforeAfter.sol helper (pre/post state diffing for assertions like “share price never decreased”).

The advantage: the same harness drives Foundry, Echidna, and Medusa with minimal divergence. Many production audits now ship this layout; it’s worth modelling your own scaffold on it even if you don’t use Recon’s hosted service.


4. Invariant Catalogues by Protocol Type

Cross-reference Tuan-15-Audit-Methodology-Tooling §6.3 for the short list. Below are richer catalogues per class, formatted for direct lift into a fuzz suite.

4.1 ERC-4626 vault (the Lab target)

#InvariantNotes
V1convertToShares(convertToAssets(x)) ≤ x and convertToAssets(convertToShares(x)) ≤ x (both directions)Rounding direction must lose to vault, not user, per ERC-4626 §“Conversion”
V2totalSupply() == 0 iff totalAssets() == 0 (or virtual-shares offset documented)First-depositor / inflation attack vector
V3After deposit(a, r): shares received ≥ previewDeposit(a) (preview is a lower bound)Spec wording
V4After withdraw(a, r, o): shares burned ≤ previewWithdraw(a)Spec wording
V5Share price (totalAssets / totalSupply) is monotone non-decreasing while no losses occurIf strategy can lose, weaken; document
V6sum(balanceOf(actor) for actor in actors) == totalSupply()Accounting consistency
V7totalAssets() >= ghost_depositSum - ghost_withdrawSum (modulo realized profit ≥ 0)Solvency
V8maxRedeem(actor) * sharePrice ≈ balanceOf(actor) in asset terms (within rounding)No surprise locks
V9After full withdraw (balanceOf(actor) == 0), actor’s recoverable claim is 0No residual claims
V10First depositor cannot grief later depositors: round-zero-shares attack mitigatedDonation-then-deposit sequence
V11previewDeposit(a) is monotone in aPricing sanity
V12Repeated previewDeposit(a) returns the same value within the same block & stateIdempotence

Bug V10 is the classic first-depositor inflation attack — the donation-attack family that hit Cream, multiple 4626 forks, and is the reason OpenZeppelin’s 4626 implementation uses a “virtual shares” offset by default. A fuzz suite that doesn’t include a donate() handler function will miss it.

4.2 AMM (constant-product)

#InvariantNotes
A1reserveA * reserveB ≥ k_prev after every swap (k can only grow due to fees)The k-invariant
A2sum(LP shares) == LP_supplyAccounting
A3LP_supply == 0 iff reserveA == 0 && reserveB == 0Boundary
A4After a swap: IERC20(tokenA).balanceOf(pool) == reserveA + amountInNo silent token escape
A5sync() followed by any read returns balances matching reservesSync correctness
A6Two consecutive swaps in opposite directions, ignoring fees, return the original state up to fee accumulationReversibility under no-fee model
A7After mint + burn for the same LP, asset out ≤ asset in (LP can’t profit from no-op)LP fairness
A8getAmountOut(x, rA, rB) is monotone non-decreasing in xPricing monotonicity
A9No swap can decrease the pool’s own token balance below zeroUnderflow safety

The Uniswap-V2 swap function’s k-invariant is the example used in nearly every Echidna tutorial — see ToB’s “Echidna 2.0” announcement.

4.3 Lending market

#Invariant
L1sum(borrows) ≤ sum(collateralValue * collateralFactor) system-wide
L2For any user, healthFactor < 1e18 ⟹ position is liquidatable (no liquidation guard locks it out)
L3Liquidation of a HF < threshold position is profitable to the liquidator (close-factor × bonus ≥ gas)
L4totalBorrows accrues interest non-decreasing over time
L5Reserve factor: protocol’s accrued reserves = interest paid − interest distributed to suppliers
L6Per-asset borrow cap never exceeded
L7repay(amt) reduces borrowOf(user) by exactly amt (modulo accrual rounding)
L8After full withdraw(collateral), user has no remaining borrow against that collateral
L9cash + sum(borrows) == totalSupply * exchangeRate (Compound-style accounting identity)

Aave V3’s published Certora rules and ToB’s Compound III review are excellent property catalogues to study; many translate to Echidna properties directly.

4.4 Stablecoin (CDP-based)

#Invariant
S1sum(collateral_USD) ≥ sum(debt) (system over-collateralised)
S2Stability fee accrual is monotone non-decreasing over time
S3A vault below liquidationRatio is liquidatable by anyone
S4Surplus = revenue − bad debt, exact (no leakage to unaccounted addresses)
S5redeem and mint operations are conservative: total stablecoin supply matches accounted minted minus burned
S6Oracle price-update window: protocol uses staleness-checked price, never reverts silently

4.5 Bridge / message-passing

#Invariant
B1lockedOnSource(asset) == mintedOnDest(asset) for every asset, modulo in-flight
B2Each messageHash is processed at most once on the destination
B3Only messages whose source-chain block is past finalized are processed
B4Validator-set rotation requires N-of-M signatures of the current set on the new set
B5Pause on either chain pauses the other (or matches spec)

(Discussed deeply in Tuan-10-Bridge-Cross-Chain-Security; replay protection in particular is fuzz-suitable.)

The 4626 catalogue (§4.1) is what we’ll encode in §8 (the Lab). The rest are reference; use them when auditing those protocol shapes.


5. Echidna — Deep Dive

5.1 Property mode vs assertion mode

Echidna supports two test modes. Pick by testMode: in the YAML config or --test-mode flag.

Property mode (default, classic):

contract TestERC20 is ERC20Vulnerable {
    function echidna_supply_constant() public view returns (bool) {
        return totalSupply() == INITIAL_SUPPLY;
    }
}

Echidna calls all public non-view functions of the contract under test, then evaluates every echidna_* view function. Returning false = failure.

Assertion mode:

function deposit(uint256 amt) public {
    uint256 priceBefore = sharePrice();
    _deposit(amt);
    uint256 priceAfter = sharePrice();
    assert(priceAfter >= priceBefore);  // Echidna catches the failure
}

Echidna treats any failing assert() (or revert with Panic(0x01)) as a failure. More flexible — you can assert per-call invariants in the middle of an operation, not just at the end.

Trade-offPropertyAssertion
Invariant locationEnd of sequenceInline, after any call
Composition with library codeLibrary code can’t assert into your test contractLibrary assert() is picked up everywhere
Cleanliness of reportOne named function per invariantAsserts must include enough context in the file/line
PerformanceSlightly faster (single eval per seq)Per-call eval

In practice: use property mode for protocol-level invariants (“sum of balances == total supply”), assertion mode for per-call pre/post relationships (“share price never decreases on this op”). A single test contract can use both.

5.2 Configuration deep-dive

A production echidna.yaml:

# What we're testing
testMode: assertion             # or "property", or "dapptest"
prefix: "echidna_"              # property-mode prefix
testLimit: 1000000              # total iterations
seqLen: 200                     # tx per sequence
shrinkLimit: 50000              # shrinking effort
corpusDir: corpus/              # persistent across runs
coverage: true                  # coverage-guided fuzzing (default true)
codeSize: 0xFFFFFF              # max contract size
 
# Senders
sender: [
    "0x10000",
    "0x20000",
    "0x30000"
]                               # the set of addresses Echidna pranks as
 
# Limits
balanceAddr: 0xFFFFFFFF
balanceContract: 0xFFFFFFFF
maxGasPerBlock: 30000000        # block-level gas cap
 
# Filtering
filterFunctions:                # functions Echidna may NOT call
    - "ERC20Vulnerable.transfer(address,uint256)"
filterBlacklist: true           # treat list as blacklist; false = whitelist
 
# Output
format: text                    # text | json | none
quiet: false
testDestruction: true           # test selfdestruct removes contract
 
# Initial calls (state setup if not done via constructor)
initialize: null

Two knobs that change behaviour, not just performance:

  • seqLen: longer sequences explore deeper state machines but each iteration is slower. Default 100, raise to 200–500 for stateful protocols, lower to 50 for simple ones.
  • testLimit: how many sequences to try. 100k for development, 1M+ for overnight.

Two knobs that only affect performance / coverage:

  • shrinkLimit: higher = smaller failing trace, more time to print. Don’t over-tune; the default is reasonable.
  • corpusDir: critical. Set it once and forget; let Echidna build a deep corpus across runs.

5.3 Multi-actor (multiple senders)

Echidna’s sender: config controls which addresses can call into the contract. Set 3–5 senders for any non-trivial fuzz. The actor selection is part of the search; Echidna will pick which sender and which function to maximise coverage.

sender: [
    "0x10000",
    "0x20000",
    "0x30000",
    "0x40000",
    "0x50000"
]

Inside the contract, balance these actors before fuzzing starts (in the constructor or initialize). Without funded balances, every call reverts on the first transfer.

5.4 Slither integration

Echidna invokes crytic-compile, which understands Foundry, Hardhat, Truffle. Behind the scenes Slither is used for type information (knowing function signatures, public state vars, modifiers).

You can also pre-run Slither’s property-generation features:

slither-prop . --contract Vault --etheno-format

This generates a starter set of properties (e.g., totalSupply non-negative, balances sum). Treat the output as inspiration, not as the final list — auto-generated properties are usually too weak.

5.5 Reading Echidna output

A failing test:

echidna_share_price_monotone: FAILED!💥
  Call sequence:
    setUp()
    deposit(0x10000, 1000000)  from: 0x10000
    donate(500000)              from: 0x20000
    withdraw(0x10000, 500000)   from: 0x10000
    assertEq(sharePrice(), prevSharePrice())  -- value: 999, prev: 1000

Unique instructions: 4234
Unique codehashes:   3
Corpus size:         128
Seed:                12345678

The “call sequence” is the minimised failing trace. Always reproduce it as a Foundry unit test before fixing — it’s the contract.

Save the seed; you can re-run with --seed 12345678 to reproduce.

5.6 Common pitfalls

  • Unfunded senders. Senders need ETH (or token balances) for any meaningful call. Fund in constructor.
  • Properties evaluated after revert-only sequences. If every call reverts, properties hold trivially. Echidna prints unique codehashes: 0 or similar — if you see “0 unique” you’re not exercising anything.
  • Tightly coupled to Echidna version semantics. Some properties of the engine (default seqLen, default shrinkLimit) have changed across versions. Pin a version in CI: echidna --version.
  • Solidity version mismatch. Echidna installs a solc version; your project may want a different one. Use crytic-compile’s --solc flag or solc-select to align.

6. Medusa — When and Why

6.1 Origin and differences

Medusa is Trail of Bits’ Go-based fuzzer, designed to be a more modern, parallel, fork-capable successor to Echidna. Same compilation pipeline (crytic-compile), similar property/assertion modes, very different engine.

DimensionEchidnaMedusa
LanguageHaskellGo
ParallelismSingle-threadedMulti-worker, parallel
Mainnet forkLimited / not first-classFirst-class via go-ethereum
Corpus formatEchidna’s binary corpusJSON, human-readable
Property prefixechidna_medusa_ (or configurable)
Active development (2026)MaintainedMore active for new features [verify]

6.2 Configuration (medusa.json)

{
  "fuzzing": {
    "workers": 10,
    "workerResetLimit": 50,
    "timeout": 0,
    "testLimit": 0,
    "callSequenceLength": 100,
    "corpusDirectory": "medusa-corpus",
    "coverageEnabled": true,
    "targetContracts": ["VaultHandler"],
    "targetContractsBalances": [],
    "constructorArgs": {},
    "deployerAddress": "0x30000",
    "senderAddresses": ["0x10000", "0x20000", "0x30000", "0x40000"],
    "blockNumberDelayMax": 60480,
    "blockTimestampDelayMax": 604800,
    "blockGasLimit": 125000000,
    "transactionGasLimit": 12500000
  },
  "compilation": {
    "platform": "crytic-compile",
    "platformConfig": {
      "target": ".",
      "solcVersion": "",
      "exportDirectory": "",
      "args": []
    }
  },
  "logging": {
    "level": "info",
    "logDirectory": ""
  },
  "testing": {
    "stopOnFailedTest": false,
    "stopOnFailedContractMatching": false,
    "testAllContracts": false,
    "traceAll": false,
    "assertionTesting": {
      "enabled": true,
      "testViewMethods": false,
      "panicCodeConfig": {
        "failOnCompilerInsertedPanic": false,
        "failOnAssertion": true,
        "failOnArithmeticUnderflowOverflow": true,
        "failOnDivideByZero": true,
        "failOnEnumTypeConversionOutOfBounds": false,
        "failOnIncorrectStorageAccess": false,
        "failOnPopEmptyArray": false,
        "failOnOutOfBoundsArrayAccess": false,
        "failOnUninitializedFunctionPointerCall": false
      }
    },
    "propertyTesting": {
      "enabled": true,
      "testPrefixes": ["echidna_", "medusa_", "property_"]
    },
    "optimizationTesting": {
      "enabled": false,
      "testPrefixes": ["optimize_"]
    }
  }
}

Notable: Medusa lets you treat individual Solidity panic codes as failures (arith overflow, div by zero, array OOB). You can effectively ask “does any reachable execution trigger a panic 0x11?” — very useful for catching subtle overflows even after Solidity 0.8’s default checks.

6.3 When to choose Medusa over Echidna

Choose Medusa when:

  • You need mainnet-fork fuzzing (e.g., the protocol integrates with live Uniswap, Aave, Chainlink — and mocking would lose fidelity).
  • You have many cores and want parallel speedup (10 workers → ~10× testLimit per wall-clock hour).
  • You want a single tool whose corpus is human-readable JSON (easier to inspect, share, diff).
  • You’re starting greenfield in 2025+ — Medusa is the more active default.

Choose Echidna when:

  • The project already has an Echidna corpus; switching loses warm-up.
  • You’re relying on Echidna-specific features (e.g., dapptest mode, certain shrinking behaviour).
  • A specific bug class is reproduced more reliably by Echidna’s mutation strategy (rare but it happens; A/B and observe).

Practical: run both. The corpus directories are separate but the harness code is shared (you write Solidity once). Schedule:

  • Per PR: Foundry invariant tests with small runs (CI gate, <5 min).
  • Per night: Echidna and Medusa each, on a beefier machine, hours of runtime, persistent corpus.
  • Per audit: pre-engagement and during fuzzing day, both engines on full corpus, ≥24 wall-clock-hour budget.

6.4 Output and replay

Medusa prints failure traces in JSON. Each is reproducible. Same workflow as Echidna: copy into a deterministic Foundry test before fixing.

medusa fuzz --config medusa.json
# On failure:
medusa fuzz --config medusa.json --test-limit 1 --corpus-replay-directory medusa-corpus/failed

7. Corpus Management & Reproducibility

7.1 Why the corpus matters

A coverage-guided fuzzer’s effectiveness compounds with corpus quality. A 100k-run cold start typically finds 50–80% of the bugs a 100k-run warm start finds. Across an audit and remediation cycle, your corpus is data — preserve it.

7.2 Practices

  • Commit corpus to git (or to an audit-internal artifact store) at the end of each fuzzing session.
  • Tag corpora: corpus/echidna-pre-fix/, corpus/echidna-post-fix/. After a fix, restart from the pre-fix corpus to confirm the bug is not reachable anymore (the engine will quickly re-find it if the fix is incomplete).
  • Share between engines: Recon-style scaffolds let Echidna and Medusa share the same scenarios via the unified handler. Their internal corpora diverge but the entry-points-set is the same.
  • Beware Solidity version bumps invalidating the corpus. Compiler-emitted bytecode changes can shift coverage instrumentation; an old corpus may not map cleanly to new code. Re-seed if you bump majors.

7.3 Replay against new code

Standard practice during remediation review:

  1. Take the pre-fix corpus.
  2. Compile the post-fix code.
  3. Run Echidna/Medusa with --no-coverage (or equivalent), feeding the old corpus sequences against the new code.
  4. Confirm: the original failing sequence now passes; no new invariants fail.
  5. Then run a fresh coverage-guided session to look for new bugs the fix may have introduced.

This is the single highest-ROI fuzzing activity during remediation. If the fix is correct, step 4 takes minutes. If it’s incomplete, you find out before signing off.

7.4 Determinism gotchas

Fuzzing isn’t perfectly deterministic out of the box. Sources of nondeterminism:

  • Random seed: pin with --seed N (Echidna) or --fuzz-seed (Medusa) / config.
  • Wall-clock-based timeouts: a slow machine may explore less in the same wall-clock budget than a fast one; pin testLimit not timeout.
  • Worker scheduling (Medusa): multi-worker exploration order varies; pin workers: 1 for fully deterministic reproduction. Use multi-worker only for production search.

8. Lab — Build a Complete Invariant Suite for an ERC-4626 Vault

This is the central exercise. Expect 1 full day for the Foundry suite, half a day for the Echidna port, half a day for the bug-plant and detection comparison.

8.1 Setup

mkdir -p ~/web3-sec-lab/bonus-fuzz && cd ~/web3-sec-lab/bonus-fuzz
forge init --no-commit vault4626-fuzz
cd vault4626-fuzz
forge install OpenZeppelin/openzeppelin-contracts

In foundry.toml:

[profile.default]
solc_version = "0.8.24"
optimizer = true
optimizer_runs = 200
src = "src"
test = "test"
remappings = [
    "openzeppelin/=lib/openzeppelin-contracts/contracts/",
    "forge-std/=lib/forge-std/src/"
]
 
[invariant]
runs = 256
depth = 200
fail_on_revert = true
shrink_run_limit = 10000
 
[invariant.deep]
runs = 5000
depth = 500
fail_on_revert = true
shrink_run_limit = 50000

forge invariant will use the default config; FOUNDRY_PROFILE=deep forge invariant will use the heavier settings for overnight runs.

8.2 Target contract — a minimal Vault4626

// src/Vault4626.sol
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;
 
import {ERC20} from "openzeppelin/token/ERC20/ERC20.sol";
import {IERC20} from "openzeppelin/token/ERC20/IERC20.sol";
import {SafeERC20} from "openzeppelin/token/ERC20/utils/SafeERC20.sol";
 
/// @notice Minimal ERC-4626-ish vault for fuzz lab.
/// Deliberately *does not* use OZ's 4626 (which has virtual-shares mitigation).
contract Vault4626 is ERC20 {
    using SafeERC20 for IERC20;
 
    IERC20 public immutable asset;
 
    constructor(IERC20 _asset) ERC20("Vault Shares", "vSHARE") {
        asset = _asset;
    }
 
    function totalAssets() public view returns (uint256) {
        return asset.balanceOf(address(this));
    }
 
    function convertToShares(uint256 assets) public view returns (uint256) {
        uint256 supply = totalSupply();
        return (supply == 0) ? assets : (assets * supply) / totalAssets();
    }
 
    function convertToAssets(uint256 shares) public view returns (uint256) {
        uint256 supply = totalSupply();
        return (supply == 0) ? shares : (shares * totalAssets()) / supply;
    }
 
    function previewDeposit(uint256 assets) public view returns (uint256) {
        return convertToShares(assets);
    }
 
    function previewRedeem(uint256 shares) public view returns (uint256) {
        return convertToAssets(shares);
    }
 
    function deposit(uint256 assets, address receiver) external returns (uint256 shares) {
        shares = previewDeposit(assets);
        require(shares > 0, "ZERO_SHARES");
        asset.safeTransferFrom(msg.sender, address(this), assets);
        _mint(receiver, shares);
    }
 
    function redeem(uint256 shares, address receiver, address owner) external returns (uint256 assets) {
        require(msg.sender == owner, "ONLY_OWNER");
        assets = previewRedeem(shares);
        _burn(owner, shares);
        asset.safeTransfer(receiver, assets);
    }
}

Plus a MockERC20:

// src/MockERC20.sol
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;
 
import {ERC20} from "openzeppelin/token/ERC20/ERC20.sol";
 
contract MockERC20 is ERC20 {
    constructor() ERC20("Mock", "MOCK") {}
    function mint(address to, uint256 amt) external { _mint(to, amt); }
}

This vault has intentional weaknesses — no virtual shares, integer rounding always against the vault’s net interest. Good. We want the fuzzer to find them.

8.3 The handler

// test/invariant/VaultHandler.sol
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;
 
import {Test, console2} from "forge-std/Test.sol";
import {Vault4626} from "src/Vault4626.sol";
import {MockERC20} from "src/MockERC20.sol";
 
contract VaultHandler is Test {
    Vault4626 public vault;
    MockERC20 public asset;
 
    // --- Actors ---
    address[] public actors;
    address internal currentActor;
 
    // --- Ghost variables ---
    uint256 public ghost_depositSum;
    uint256 public ghost_withdrawSum;
    uint256 public ghost_donationSum;
    mapping(address => uint256) public ghost_actorDeposited;
    mapping(address => uint256) public ghost_actorWithdrawn;
    mapping(bytes32 => uint256) public ghost_calls;
    mapping(bytes32 => uint256) public ghost_reverts;
    uint256 public ghost_lastSharePrice;
 
    constructor(Vault4626 _v, MockERC20 _a) {
        vault = _v;
        asset = _a;
        for (uint256 i = 0; i < 5; i++) {
            address u = address(uint160(0xCAFE0000 + i));
            actors.push(u);
            asset.mint(u, 1e36);
            vm.startPrank(u);
            asset.approve(address(vault), type(uint256).max);
            vm.stopPrank();
        }
        ghost_lastSharePrice = 1e18;  // initial 1:1 assumption
    }
 
    modifier useActor(uint256 actorSeed) {
        currentActor = actors[bound(actorSeed, 0, actors.length - 1)];
        vm.startPrank(currentActor);
        _;
        vm.stopPrank();
    }
 
    modifier countCall(bytes32 key) {
        ghost_calls[key]++;
        _;
    }
 
    function _sharePrice() internal view returns (uint256) {
        uint256 supply = vault.totalSupply();
        if (supply == 0) return 1e18;
        return (vault.totalAssets() * 1e18) / supply;
    }
 
    function deposit(uint256 actorSeed, uint256 amt)
        external
        useActor(actorSeed)
        countCall("deposit")
    {
        amt = bound(amt, 1, asset.balanceOf(currentActor));
        if (amt == 0) {
            ghost_reverts["deposit"]++;
            return;
        }
        // Predict zero-shares revert and skip
        if (vault.previewDeposit(amt) == 0) {
            ghost_reverts["deposit"]++;
            return;
        }
        try vault.deposit(amt, currentActor) {
            ghost_depositSum += amt;
            ghost_actorDeposited[currentActor] += amt;
        } catch {
            ghost_reverts["deposit"]++;
        }
    }
 
    function redeem(uint256 actorSeed, uint256 sharesSeed)
        external
        useActor(actorSeed)
        countCall("redeem")
    {
        uint256 maxShares = vault.balanceOf(currentActor);
        if (maxShares == 0) {
            ghost_reverts["redeem"]++;
            return;
        }
        uint256 shares = bound(sharesSeed, 1, maxShares);
        uint256 assetsOut;
        try vault.redeem(shares, currentActor, currentActor) returns (uint256 a) {
            assetsOut = a;
            ghost_withdrawSum += assetsOut;
            ghost_actorWithdrawn[currentActor] += assetsOut;
        } catch {
            ghost_reverts["redeem"]++;
        }
    }
 
    /// @notice Direct token transfer to the vault — simulates "donation attack" vector.
    function donate(uint256 actorSeed, uint256 amt)
        external
        useActor(actorSeed)
        countCall("donate")
    {
        amt = bound(amt, 1, asset.balanceOf(currentActor));
        asset.transfer(address(vault), amt);
        ghost_donationSum += amt;
    }
 
    function warp(uint256 seed) external countCall("warp") {
        uint256 jump = bound(seed, 1, 30 days);
        vm.warp(block.timestamp + jump);
    }
 
    function getActors() external view returns (address[] memory) {
        return actors;
    }
 
    // Print call distribution at the end of each invariant run
    function callSummary() external view {
        console2.log("deposit:",  ghost_calls["deposit"]);
        console2.log("redeem:",   ghost_calls["redeem"]);
        console2.log("donate:",   ghost_calls["donate"]);
        console2.log("warp:",     ghost_calls["warp"]);
        console2.log("reverts.deposit:", ghost_reverts["deposit"]);
        console2.log("reverts.redeem:",  ghost_reverts["redeem"]);
        console2.log("ghost_depositSum:",  ghost_depositSum);
        console2.log("ghost_withdrawSum:", ghost_withdrawSum);
        console2.log("ghost_donationSum:", ghost_donationSum);
    }
}

8.4 The invariant test contract

// test/invariant/Vault4626.invariant.t.sol
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;
 
import {Test, StdInvariant} from "forge-std/Test.sol";
import {Vault4626} from "src/Vault4626.sol";
import {MockERC20} from "src/MockERC20.sol";
import {VaultHandler} from "./VaultHandler.sol";
 
contract Vault4626InvariantTest is StdInvariant, Test {
    Vault4626 internal vault;
    MockERC20 internal asset;
    VaultHandler internal handler;
 
    function setUp() public {
        asset   = new MockERC20();
        vault   = new Vault4626(asset);
        handler = new VaultHandler(vault, asset);
 
        targetContract(address(handler));
 
        bytes4[] memory s = new bytes4[](4);
        s[0] = VaultHandler.deposit.selector;
        s[1] = VaultHandler.redeem.selector;
        s[2] = VaultHandler.donate.selector;
        s[3] = VaultHandler.warp.selector;
        targetSelector(FuzzSelector({ addr: address(handler), selectors: s }));
    }
 
    // V7 — Solvency (net of donations and any future yield)
    function invariant_V7_solvent() public view {
        // totalAssets >= net deposits − net withdrawals
        // Donations only INCREASE the LHS, so the inequality is still ≥.
        uint256 lhs = vault.totalAssets();
        uint256 rhs = handler.ghost_depositSum() - handler.ghost_withdrawSum();
        assertGe(lhs, rhs, "V7: vault not solvent vs net deposits");
    }
 
    // V2 — Zero-supply iff zero-assets (relaxed: zero-supply implies assets ≤ donations only)
    function invariant_V2_supply_assets_pair() public view {
        if (vault.totalSupply() == 0) {
            // The only way assets can be present with zero supply is direct donation.
            assertLe(
                vault.totalAssets(),
                handler.ghost_donationSum() + handler.ghost_depositSum() - handler.ghost_withdrawSum(),
                "V2: assets exceed accounted flows"
            );
        }
    }
 
    // V6 — Sum of actor balances == totalSupply
    function invariant_V6_supply_matches_actor_sum() public view {
        address[] memory a = handler.getActors();
        uint256 sum;
        for (uint256 i = 0; i < a.length; i++) sum += vault.balanceOf(a[i]);
        assertEq(sum, vault.totalSupply(), "V6: actor balance sum != totalSupply");
    }
 
    // V8 — previewRedeem(balance) approximates ghost_actorDeposited - ghost_actorWithdrawn,
    //      bounded by share-count rounding.
    function invariant_V8_per_actor_claim() public view {
        address[] memory a = handler.getActors();
        for (uint256 i = 0; i < a.length; i++) {
            uint256 shares = vault.balanceOf(a[i]);
            uint256 claim  = vault.previewRedeem(shares);
            int256 net = int256(handler.ghost_actorDeposited(a[i]))
                       - int256(handler.ghost_actorWithdrawn(a[i]));
            if (net < 0) net = 0;
            // Each share can lose at most 1 wei to rounding; total claim ≥ net − shares.
            // Donations can only INCREASE per-share claim, never decrease it.
            assertGe(
                claim + shares,           // +shares = rounding tolerance
                uint256(net),
                "V8: actor claim below net deposits beyond rounding"
            );
        }
    }
 
    // V11 — previewDeposit monotone in amt
    function invariant_V11_preview_monotone() public view {
        uint256 a = 1e18;
        uint256 b = 2e18;
        assertLe(vault.previewDeposit(a), vault.previewDeposit(b), "V11: preview not monotone");
    }
 
    // Diagnostic — not an invariant, but a call distribution log
    function invariant_callSummary() public view {
        handler.callSummary();
    }
}

Five real invariants (V2, V6, V7, V8, V11) plus a diagnostic. Run:

forge test --match-contract Vault4626InvariantTest -vv
# Deep run:
FOUNDRY_PROFILE=deep forge test --match-contract Vault4626InvariantTest -vvv

Expected: on this naive vault, V2 and V8 will likely fail under specific sequences (donation-then-deposit shifts the share price, distorting per-actor accounting; depending on rounding the suite catches the first-depositor inflation attack).

8.5 Diagnose a failure — first depositor / donation attack

A characteristic failing trace:

[FAIL: V8: actor claim below net deposits beyond rounding]
        [Sequence]
                actor=0xCAFE0000, deposit(amt = 1)
                actor=0xCAFE0001, donate(amt = 1e18)
                actor=0xCAFE0002, deposit(amt = 1e18)
                ...
                actor=0xCAFE0002, redeem(shares = balanceOf(actor))

Trace:

  1. Actor 0 deposits 1 wei → gets 1 share (totalSupply 0 → shares = assets).
  2. Actor 1 donates 1e18 → totalAssets = 1e18+1, totalSupply = 1.
  3. Actor 2 deposits 1e18 → previewDeposit(1e18) = 1e18 * 1 / (1e18+1) = 0 (Solidity floor div). Without our zero-share guard this would revert. With the guard the handler skips, but if we remove the guard or if a different sequence pushes shares to non-zero-but-tiny, actor 2 mints 1 share but contributes 1e18 asset.
  4. Actor 2 redeems 1 share → gets 1 * (1e18+1+1e18) / 2 ≈ 1.5e18 asset — more than they put in, at actor 0’s expense.

This is the classic first-depositor / donation attack. Documented fix: virtual shares offset (OZ’s 4626 default since ~v4.9). Without it, any 4626 vault has this issue.

In our handler we guard against the immediate revert, but the economic invariant V8 still fails because the rounding direction shifts value between actors. That’s the bug.

8.6 Patch and re-run

Fix by upgrading to OZ’s ERC4626.sol, which uses the decimal-offset virtual-shares mitigation:

// src/Vault4626.sol (patched)
import {ERC4626, IERC20Metadata} from "openzeppelin/token/ERC20/extensions/ERC4626.sol";
 
contract Vault4626 is ERC4626 {
    constructor(IERC20Metadata _asset)
        ERC4626(_asset)
        ERC20("Vault Shares", "vSHARE") {}
 
    function _decimalsOffset() internal pure override returns (uint8) {
        return 1;  // 10× virtual shares; pick 2–6 depending on threat model
    }
}

Re-run. V8 should now pass. The donation attack still inflates totalAssets, but per-actor claim no longer collapses to zero because virtual shares dominate the ratio at small supply.

8.7 Port to Echidna

// test/echidna/EchidnaVault.sol
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;
 
import {Vault4626} from "src/Vault4626.sol";
import {MockERC20} from "src/MockERC20.sol";
 
contract EchidnaVault {
    Vault4626 public vault;
    MockERC20 public asset;
 
    address constant ACTOR_A = address(0x10000);
    address constant ACTOR_B = address(0x20000);
    address constant ACTOR_C = address(0x30000);
 
    uint256 internal ghost_depositSum;
    uint256 internal ghost_withdrawSum;
    uint256 internal ghost_donationSum;
 
    constructor() {
        asset = new MockERC20();
        vault = new Vault4626(asset);
        asset.mint(ACTOR_A, 1e30);
        asset.mint(ACTOR_B, 1e30);
        asset.mint(ACTOR_C, 1e30);
    }
 
    function deposit(uint8 actorIdx, uint256 amt) public {
        address actor = _actor(actorIdx);
        amt = amt % asset.balanceOf(actor);
        if (amt == 0) return;
        if (vault.previewDeposit(amt) == 0) return;
 
        // Mock prank: approve and call as actor via low-level
        // Echidna fuzzes msg.sender via sender config; here we model differently.
        // For brevity, this test runs as the EOA echidna picks; in real harness use sender config.
        asset.approve(address(vault), amt);
        try vault.deposit(amt, msg.sender) {
            ghost_depositSum += amt;
        } catch {}
    }
 
    function redeem(uint256 shares) public {
        uint256 max = vault.balanceOf(msg.sender);
        if (max == 0) return;
        shares = shares % (max + 1);
        if (shares == 0) return;
        try vault.redeem(shares, msg.sender, msg.sender) returns (uint256 a) {
            ghost_withdrawSum += a;
        } catch {}
    }
 
    function donate(uint256 amt) public {
        amt = amt % asset.balanceOf(msg.sender);
        if (amt == 0) return;
        asset.transfer(address(vault), amt);
        ghost_donationSum += amt;
    }
 
    function _actor(uint8 idx) internal pure returns (address) {
        if (idx % 3 == 0) return ACTOR_A;
        if (idx % 3 == 1) return ACTOR_B;
        return ACTOR_C;
    }
 
    // --- Properties ---
 
    function echidna_solvent() public view returns (bool) {
        return vault.totalAssets() >= ghost_depositSum - ghost_withdrawSum;
    }
 
    function echidna_supply_matches_assets_dir() public view returns (bool) {
        if (vault.totalSupply() == 0) {
            return vault.totalAssets() <= ghost_donationSum + ghost_depositSum - ghost_withdrawSum;
        }
        return true;
    }
 
    function echidna_preview_monotone() public view returns (bool) {
        return vault.previewDeposit(1e18) <= vault.previewDeposit(2e18);
    }
}

echidna.yaml:

testMode: property
prefix: "echidna_"
testLimit: 500000
seqLen: 100
corpusDir: corpus-echidna/
coverage: true
sender: ["0x10000", "0x20000", "0x30000"]
balanceAddr: 0xffffffff
balanceContract: 0xffffffff

Run:

echidna test/echidna/EchidnaVault.sol --contract EchidnaVault --config echidna.yaml

8.8 Plant a bug, measure time-to-detection

Now the experiment: plant a subtle bug, run both engines, record how long each takes to find it.

Bug to plant: flip the rounding direction in convertToShares to favour the user (round up instead of down).

function convertToShares(uint256 assets) public view returns (uint256) {
    uint256 supply = totalSupply();
    if (supply == 0) return assets;
    // BUG: round UP instead of down
    return (assets * supply + totalAssets() - 1) / totalAssets();
}

This now violates V7 — convertToShares rounds in the user’s favour, so on deposit users can mint slightly more shares than the asset they brought. Over many deposits, totalAssets() drifts below ghost_depositSum - ghost_withdrawSum.

Expected results (illustrative — record your own):

EngineConfigurationTime to first failure
Foundry invariantruns=256, depth=200<1 minute typically; tight ghost makes it obvious
Foundry invariantruns=5000, depth=500seconds, with shrinking
EchidnatestLimit=500000, seqLen=10010s–2min depending on hardware
Medusaworkers=10, callSequenceLength=100seconds (parallelism)

Document the time-to-detection in your lab notes; it’s the empirical evidence that the suite would have caught this class of bug.

8.9 Plant a stealthier bug — off-by-one in totalAssets

Try a harder one: make totalAssets() return asset.balanceOf(address(this)) - 1. Subtle, intentional fee-skim shape.

Now V7 may not trigger directly (donations mask it), but a tighter invariant like:

function invariant_V7b_strict_solvent() public view {
    assertGe(
        vault.totalAssets(),
        handler.ghost_depositSum() - handler.ghost_withdrawSum(),
        "V7b strict: assets < net deposits"
    );
}

…requires V7’s tolerance be zero (no rounding fudge) and the handler avoid donations between the deposit and the check. The lesson: the invariant must be tight enough to detect the bug, the handler loose enough to reach it.

This is the auditor’s craft. There is no general algorithm.

8.10 Stretch lab — fork mode with Medusa

For the same vault, point Medusa at a real ERC-20 (e.g., DAI on mainnet) by enabling fork mode:

"fuzzing": {
  ...
  "rpcUrl": "https://eth-mainnet.g.alchemy.com/v2/<key>",
  "rpcBlock": 19000000
}

Now MockERC20 is replaced by mainnet DAI. The vault interacts with a real token contract. This catches bugs that depend on specific token behaviour (USDT-style no-return, fee-on-transfer, etc.). Echidna’s fork support is more limited; Medusa shines here.


9. Time-Budget Allocation

Within a fixed-scope audit (say, 10 working days, with fuzzing budgeted at 15% = ~1.5 days), how do you spend the time?

9.1 The 10/30/30/30 split

PhaseTime shareWhat you do
Setup10%Install tooling, scaffold harness skeleton, wire the protocol’s deploy code into a Foundry test, write a Setup.sol.
Invariant identification30%Read docs + spec + tests. List invariants per module. Grade them (see §2.3). Drop weak ones, sharpen the rest. Add the protocol-specific ones nobody wrote down.
Handler authoring30%Write the handler. Bound inputs. Add ghost variables. Wire multi-actor pranking. Run a sanity check (invariant_callSummary confirms exercises). Iterate until the call distribution looks balanced and reverts are explained.
Run + triage30%Run Foundry invariant on default; then Echidna + Medusa overnight; triage failures; reproduce as unit tests; classify (bug vs invariant-too-strong); write up.

The two pitfalls are at the ends:

  • Spending 70% on setup: you finish with a beautiful harness and no time to triage. Triage is where bugs become findings.
  • Spending 70% on run + triage: implies invariants were too weak (catches nothing) or handler too narrow (catches false positives). Stop and reinvest in §2 / §3.

9.2 Daily structure for a fuzzing-heavy day

09:00 - 10:30   Skim docs and spec; jot 5–10 invariants on a whiteboard.
10:30 - 12:30   Sharpen invariants. Translate to `invariant_*` headers (no body yet).
12:30 - 13:00   Lunch.
13:00 - 16:00   Write handler. Bound inputs. Add ghosts. Multi-actor.
16:00 - 17:00   First `forge invariant` run on default profile. Inspect callSummary.
17:00 - 17:30   Tighten any obviously over-reverting handler functions.
17:30 - 18:00   Kick off overnight `FOUNDRY_PROFILE=deep` + Medusa run.

(overnight)     Engines run.

09:00 next day  Triage failures. Reproduce. Classify. Repeat.

Spending one full day per major module on this structure is the right pace for a thoroughly-fuzzed audit.

9.3 What “done” looks like

A fuzzing phase is done when:

  • All invariants pass at deep profile (runs=5000, depth=500) on Foundry and 1M iterations on Echidna/Medusa.
  • The callSummary shows a healthy distribution (every handler function called at least 5% of the time).
  • Revert rate per handler function is <30% (otherwise the bound is too narrow or there’s an unmodeled precondition).
  • The corpus is committed.
  • All failed runs that occurred during exploration were either fixed in code (with a finding written), or had the invariant adjusted with a documented reason.

If invariants all “pass” but callSummary shows deposit: 5000, redeem: 0, you didn’t fuzz, you ran a single-function loop. Re-tighten and re-run.


10. What Fuzzing Cannot Find

Fuzzing has hard limits. Knowing them tells you when to escalate to symbolic execution (Tuan-Bonus-Formal-Verification-Deep) or accept the residual risk.

10.1 Properties over universals ()

A fuzzer samples; a prover quantifies. If your property is “for all possible inputs”, a fuzzer can only show “I tried N inputs and none failed”. A symbolic tool (Halmos, Certora) can show “no input exists that violates this”.

Examples a fuzzer won’t reliably prove:

  • mulDiv(a, b, c) never overflows for any (a, b, c) in a given range.” — Symbolic answers yes/no exactly; fuzzing answers “didn’t find one”.
  • “Function X always reverts unless caller has role Y.” — Symbolic checks every call path; fuzzing might miss the one path that bypasses.

For pure-math libraries, prefer Halmos. For library-level properties, fuzz + Halmos in parallel.

10.2 Sequence-shape bugs requiring exact ordering

Some bugs require an exact 12-call sequence with specific inter-call constraints — flash-loan-then-deposit-then-oracle-update-then-liquidate. A blind sequence fuzzer rarely stumbles into that exact sequence at random. Coverage guidance helps, but for very deep paths, directed testing beats coverage-guided.

Mitigation: write a guided fuzz handler where one function explicitly performs the “interesting prefix” (e.g., a setupFlashLoanScenario() helper) and the engine fuzzes around the rest.

10.3 Specification gaps

If the spec doesn’t say “user X can never be DoS’d by user Y”, you won’t write the invariant, and the fuzzer won’t check it. The bug is in the list, not in the engine. This is why §2 and §6 weight invariant identification at 30% of the budget.

10.4 External / oracle / cross-chain assumptions

A fuzzer can only randomise what the harness exposes. If the harness uses a fixed oracle price, the engine never asks “what if the oracle was manipulated?“. Add a setOraclePrice(uint) handler function with bounded inputs.

Cross-chain message flows are similarly unmodellable without a multi-chain harness — typically done with mock relayers + finality-window manipulation. Possible but a non-trivial extension.

10.5 Liveness / progress properties

Fuzzers can check safety properties (“X never happens”) but not liveness (“Y eventually happens”). For DeFi this matters less than in distributed systems, but examples:

  • “A position with HF < 1 is eventually liquidated by some honest actor.” — Liveness, fuzz-incompatible.
  • “After pause(), users can always withdraw via emergency path.” — Reachability property; can fuzz, but a positive result requires you to actually reach the state, not just not violate it.

For liveness, escalate to manual reasoning, game-theoretic analysis, or — at the edge — model checking.


11. Integrating Fuzzing with the Audit Lifecycle

11.1 Pre-engagement

  • Ask the client: “Do you have a fuzz suite? If yes, share it.” Often the answer is “we have unit tests”; that’s fine. The fuzz suite becomes a deliverable.
  • Estimate +1–2 auditor-days for fuzzing per module if there’s no existing harness.
  • Specify in the scoping doc: “Fuzz suite will be delivered as tests/invariant/* and tests/echidna/* directories; will run under forge test --invariant and echidna ....”

11.2 During audit

  • Stand up the harness on Day 2 after the codebase walk-through. Don’t wait.
  • Run a low-profile (runs=256) suite continuously while you do manual review. Use it as a smoke test for “did my mental model just break?“.
  • Whenever manual review surfaces a suspected invariant, encode it. If the fuzzer falsifies it, you have a finding; if it holds, you have a property to document in the report.

11.3 Post-audit / remediation

  • Deliver the fuzz suite with the report. Clients increasingly expect this.
  • On remediation, re-run the harness against patched code. Confirm previously-failing properties now hold. Confirm no new failures arose.
  • The corpus is part of the deliverable — many clients want to keep running the fuzz suite in CI after the audit ends.

11.4 Continuous fuzzing in CI

For projects that adopt continuous fuzzing:

  • Run forge test --invariant on every PR with default profile (gates merges, <5 min).
  • Run Echidna or Medusa nightly on main with deep profile + persistent corpus.
  • Alert on any failure with the reproducing sequence attached.

12. Anti-patterns

Add to the master checklist when reviewing someone else’s fuzz suite (or your own):

  • targetContract set to the protocol contract directly. No handler. Coverage is shallow.
  • vm.assume used to constrain large ranges (e.g., vm.assume(x > 0 && x < 1e24)). Use bound instead.
  • Single-actor harness. All calls msg.sender == address(this). Misses multi-party invariants.
  • fail_on_revert = false left on in final config. Hides handler bugs.
  • No callSummary-style diagnostic. No visibility into actual call distribution.
  • Ghost variables not updated after every relevant call. Invariants compare stale data; false negatives.
  • runs left at default for CI. Default is fast but shallow; deep profile must exist separately.
  • No persistent corpus for Echidna / Medusa. Cold start every run.
  • Invariant predicates that are tautologies (assertGe(totalSupply(), 0)).
  • No time-advance handler functions. Misses accrual / vesting / expiry bugs.
  • No donation / direct-transfer handler. Misses first-depositor inflation, fee-on-transfer integration bugs, accidental-receive overhang.
  • Invariant catalogue not delivered. The list is a deliverable; without it, the client can’t tell what was checked.
  • Failed sequence not reproduced as a unit test before fixing. Loses determinism during iteration.
  • Cached DOMAIN_SEPARATOR / oracle price in the harness — caches drift from contract under test.
  • Mocks too perfect. Mock ERC-20 returns true even for failing transfers — masks SafeERC20 issues. Cross-ref Tuan-07-Token-Standards-Integration-Risk weird ERC-20 matrix.

13. Trade-offs

DecisionOption AOption BAuditor’s view
Engine choice (single-engine project)Foundry invariant onlyFoundry + Echidna + MedusaAlways run ≥2 engines on critical code. Foundry-only is acceptable for low-criticality.
Test modeProperty onlyAssertion onlyBoth. Property for end-of-sequence invariants; assertion for per-call relationships.
Handler granularityOne handler per protocol contractOne handler per logical user flowPer logical flow ages better; refactor-friendly.
Input controlvm.assumeboundbound 95% of the time.
Corpus persistenceFresh each runPersistent + committedPersistent for any engagement >1 day. Commit.
fail_on_reverttrue alwaysfalse during development, true in CIStart false while bounding the handler; flip to true once handler reverts are explainable.
Where invariants liveInside the test contractIn a separate Properties.sol shared with Echidna/MedusaSeparate file. Single source of truth across engines.
Run depthShort and fastLong and overnightBoth profiles. Short for CI/dev; long for audit and remediation.
Symbolic complementNoneHalmos on hot library + Certora on critical invariantsHalmos is cheap (free, Foundry-native). Add it. Certora only on top-tier engagements.

14. Quiz (≥80% to advance)

  1. Q: What’s the difference between stateless and stateful fuzzing, and when do you choose each? A: Stateless tests one function call from a fresh state per run; good for pure math / encoders. Stateful runs a sequence of calls against a deployed contract, evaluating invariants across transitions; required for real protocols where bugs depend on the order and accumulation of operations.

  2. Q: A junior auditor’s setUp calls targetContract(address(vault)) and then writes invariant_solvent. The test passes after 50,000 runs. Why is this not strong evidence of solvency? A: Direct-target fuzzing on the vault calls every external function with raw uint256 parameters, most of which revert on input validation. Effective coverage is shallow. A passing invariant only means “in the few non-reverting calls the engine made, solvency held”. Use a handler with bounded inputs and ghost variables.

  3. Q: Why prefer bound(x, lo, hi) over vm.assume(lo <= x && x <= hi)? A: bound maps every fuzz input into the valid range, so every iteration is productive and shrinking/reproduction works deterministically. vm.assume discards out-of-range inputs, wasting iterations and reducing effective sequence depth, especially with tight ranges.

  4. Q: What’s a ghost variable and why is it essential to invariant testing? A: A ghost variable is shadow state maintained in the handler that tracks what an outside accountant would know (cumulative deposits, withdrawals, per-actor flows). It’s the source-of-truth the invariant compares the protocol’s reported state against. Without ghosts you can only check internal consistency (already covered by unit tests); with ghosts you check external consistency.

  5. Q: When would you choose Medusa over Echidna? A: When you need mainnet-fork fuzzing (Medusa is geth-based, first-class fork support), when you have multi-core hardware (Medusa is parallel; Echidna is single-threaded), or when starting greenfield in 2025+ as Medusa is more actively developed. Echidna remains preferred for projects with an existing Echidna corpus or where its specific shrinking/mutation behaviour is needed.

  6. Q: A fuzz suite reports all invariants pass. The invariant_callSummary shows deposit: 4500, redeem: 0, donate: 0, warp: 0. Should you ship? A: No. The fuzzer only exercised deposit. Invariants involving withdrawals, donations, or time-dependent state were never reached, so “pass” is uninformative for those properties. Either the handler is bugged (other functions revert before doing anything observable) or the seed/runs configuration is biased. Fix the distribution before trusting the result.

  7. Q: You plant a rounding-direction bug. Foundry invariant catches it in 30 seconds; Echidna in 90 seconds; Medusa in 10 seconds. What does this tell you about your harness? A: All three engines reach the bug, which means the invariant is correctly stated and the handler exposes the path. The timing differences reflect engine parallelism and coverage-guidance quality. The bug was easy in this harness — for hard bugs you’d see the spread widen and one engine miss it entirely. Document the time-to-detection as evidence the suite has falsifying power.

  8. Q: Why is “no failures after 1M runs” weaker evidence than “Halmos proves no failure exists”? A: Fuzzing samples a tiny subset of the input space. A bug at one specific input out of 2^256 has near-zero probability of being sampled. Symbolic execution (Halmos) reasons about all inputs at once via SMT — if it succeeds, the property holds universally up to the bound on loops and external calls.

  9. Q: A fuzz suite finds a failing sequence: donate(1e18) → deposit(1). How do you confirm whether this is a real bug or an invariant-too-strong? A: Reproduce the sequence as a deterministic Foundry unit test. Compute what the protocol should return per the spec at each step. If the spec allows the resulting state (donation distorts share price; documented), the invariant is too strong — relax it with a tolerance. If the spec says the state shouldn’t occur, it’s a real bug — write the finding.

  10. Q: You’re remediating an audit. The client fixes the bug and asks for sign-off. What’s the minimum fuzzing workflow? A: (1) Take the pre-fix corpus and replay against post-fix code; confirm the original failing sequence now passes. (2) Run a fresh coverage-guided session of equal length to the original audit’s fuzzing budget; confirm no new failures arise. (3) Re-derive and re-run the full invariant catalogue. If all three steps pass cleanly, sign off; if any fail, the fix is incomplete or introduced a regression — escalate.


15. Deliverables

  • Lab project (vault4626-fuzz/) with the Foundry invariant suite ≥5 invariants, handler with multi-actor + ghosts + time advance.
  • Echidna properties file for the same vault.
  • Medusa config (medusa.json) running the same target.
  • One planted-bug experiment with time-to-detection recorded for all three engines, written up in notes/lab-results.md.
  • Updated personal audit checklist: every anti-pattern in §12 added.
  • Invariant catalogue for the 4626 vault, copy-pasteable for future audits — drop into [[invariant-test-templates]] (the attachment).
  • (Stretch) Medusa fork-mode run against mainnet DAI; lab results showing token-quirk bugs surfaced (if any).
  • (Stretch) Halmos symbolic check on convertToShares and convertToAssets for monotonicity / no-overflow.

16. Where this leads

The natural next stop is Tuan-Bonus-Formal-Verification-Deep — the symbolic and SMT-based tools that pick up where fuzzing’s sampling ends. Halmos especially overlaps with the Foundry workflow you built here; the same handler can drive symbolic exploration of pure-math hotspots.

For applied use of these techniques in real engagements, return to Tuan-15-Audit-Methodology-Tooling §9 and §10, where this material is the engine-layer detail behind the time-share allocations.

For deeper case studies of bugs that invariant testing would have caught had it been deployed, study Case-Cream-Iron-Bank-2021 (ERC-777 callback invariants), Case-Euler-Finance-2023 (donation-attack-on-internal-accounting; near-textbook 4626 inflation analog), and Case-Penpie-Pendle-2024 (reward-accounting invariant). Every one of these had an invariant that, if encoded and fuzzed pre-launch, would have failed.

The deeper philosophical lesson — same as the rest of the auditing craft: the engine is not the work. The invariants are the work. The handler is the work. The triage is the work. The engine is the tool that makes the work scalable.


Last updated: 2026-05-16 See also: Tuan-15-Audit-Methodology-Tooling · Tuan-Bonus-Formal-Verification-Deep · invariant-test-templates · lab-dev-environment · foundry-cheatsheet · MOC-Web3-Security-Mastery