Bonus — Formal Verification for Smart Contracts (Deep Dive)

“Fuzzing finds the bugs you can stumble onto. Formal verification proves the bugs you cannot. When the invariant must hold for the trillion-dollar accounting line, ‘we ran ten million random inputs and nothing broke’ is not an answer — it is a literal probability statement, and a sophisticated attacker is not a random input. The auditor’s job, at the top of the craft, is to translate the protocol’s promise into a mathematical statement and force a solver to either prove it or hand you the witness.”

Tags: web3-security formal-verification certora halmos cvl symbolic invariant methodology tool Learner: Comfortable with Tuan-05-Vulnerability-Classes-Part-1 and the invariant material in Tuan-15-Audit-Methodology-Tooling; ready to graduate from “I ran Echidna” to “I proved the property” Time: 5–8 days self-paced (treat as deep bonus; a single audit using these tools well can save a client eight figures) Related: Tuan-15-Audit-Methodology-Tooling · Tuan-Bonus-Fuzzing-Invariant-Advanced · Tuan-Bonus-ZK-Circuit-Security · invariant-test-templates · Tuan-04-Security-Foundations-CEI-AC

1. Context & Why

1.1 The auditor’s question this chapter answers

You’ve done Week 15. You can write invariants, harness Foundry’s invariant runner, run Echidna and Medusa overnight, and produce a report. You are still missing one tool: proof.

Fuzzing demonstrates a bug, but it never demonstrates absence of bugs. After 10M Echidna runs without a counterexample you have evidence that the bug is rare among uniformly-distributed inputs — but an attacker is not uniformly distributed, and a 1-in-2^96 corner case (a specific oracle price, a specific reserve ratio, a specific calldata shape) will never surface in Echidna’s lifetime. The bugs that drained Curve, Cream, Euler, Mango Markets all had counterexamples that fuzzing would have hit only if it had been told exactly where to look.

Formal verification (FV) is the discipline of writing a mathematical statement of what the contract must do and then asking a solver — Z3, CVC5, Yices — to either prove the statement holds across all possible executions or return a concrete counterexample. There is no “we tested it enough”; there is only “proven” or “here is the bug, with exact input values”.

This chapter is a deep dive on the toolchain — Certora, Halmos, hevm, K/KEVM — and, more importantly, on the auditor’s workflow around them: when to reach for which tool, how to write specs that actually prove things, how to recognize when the spec is wrong (the most common failure mode, by a wide margin), and how to integrate FV into a normal audit timeline.

1.2 What you’ll be able to do at the end

Explain the difference between bounded model checking, symbolic execution, and theorem proving — in one sentence each, with the tool you’d use for each.
Recognize the three classes of property where FV is the right tool and the three where it’s the wrong tool.
Write a basic Certora CVL spec for an ERC-4626 vault: methods block, invariant, parametric rule, ghost variable with hook.
Write a Halmos check_* test that proves an access-control matrix.
Read a Certora counterexample and trace it back to the violating line of Solidity.
Articulate the cost/benefit case to a client deciding whether to add FV to an engagement.

1.3 Primary references

Source	URL	Notes
Certora Prover docs (CVL)	https://docs.certora.com/	Canonical CVL reference. CVL2 is the current language version. [verify version on docs site]
Certora Examples repo	https://github.com/Certora/Examples	Working specs for ERC20, ERC4626, Aave, Compound, MakerDAO patterns. Start here for templates.
Halmos	https://github.com/a16z/halmos	a16z’s BMC tool. v0.3.x as of 2025; check `halmos --version`. [verify]
Halmos cheatcodes (svm)	https://github.com/a16z/halmos-cheatcodes	Symbolic primitives — `svm.createUint(...)`, `svm.createBytes(...)`.
hevm	https://github.com/ethereum/hevm	Symbolic engine, Foundry-compatible, Haskell. v0.57.x as of early 2026. [verify]
KEVM (Runtime Verification)	https://github.com/runtimeverification/evm-semantics	The EVM, in K. Foundation for Kontrol.
Kontrol (RV Foundry plugin)	https://github.com/runtimeverification/kontrol	KEVM-backed Foundry verification (commercial).
K framework	https://kframework.org/	Operational semantics framework.
Trail of Bits — Manticore (historical)	https://github.com/trailofbits/manticore	Older symbolic tool; mostly superseded by Halmos/hevm for EVM.
ConsenSys — Mythril (historical)	https://github.com/Consensys/mythril	Older symbolic executor. Still occasionally useful for bytecode-only analysis.
a16z blog — Halmos intro	https://a16zcrypto.com/ (search “halmos”)	Motivation and walkthrough.
Certora blog	https://www.certora.com/blog	Real-bug write-ups: Silo, Texture, infiniFi, Kamino, Aave V4, Lido dual-governance, Uniswap V4.
Trail of Bits — “Building Secure Contracts”	https://github.com/crytic/building-secure-contracts	Has a “Properties” section with manual invariant patterns.
Underhanded Solidity Contest writeups	https://github.com/Underhanded-Solidity-Contest	Curated bugs that look benign and are hard to formalize — best test bed for “can my spec catch this?“.

2. What Formal Verification Is (and Isn’t)

2.1 The mathematical claim

Formal verification proves a statement of the form:

For all inputs x satisfying precondition P(x), the system’s behavior S(x) satisfies postcondition Q(x, S(x)).

Three things to internalise:

“For all” is literal. Not “for the 10M inputs Echidna sampled”. Not “for the inputs the human auditor thought of”. For every input in the symbolic input space — typically uint256, the full 2^256 set, or bytes calldata of every length up to a bound.
The system is the bytecode, not the source. The prover compiles your Solidity to EVM bytecode (or works on a bytecode trace) and reasons about the actual semantics. This matters: Vyper compiler bugs, Solidity optimizer artifacts, ABI-decoding quirks — all visible.
Either the prover returns “proved” or it returns a counterexample. A counterexample is the auditor’s gold: a specific assignment of inputs that violates the postcondition. Translate it back to a PoC, file a finding.

The three sub-disciplines:

Discipline	What it does	Best tool for EVM	Soundness scope
Bounded Model Checking (BMC)	Unrolls execution to depth N; checks property over all paths of length ≤ N	Halmos, hevm	Sound up to bound N. Anything beyond N is unchecked.
Symbolic execution	Explores all symbolic paths; uses SMT to decide path conditions	Halmos, hevm, Manticore (legacy)	Sound when termination is reached; otherwise depends on bounds.
Inductive theorem proving	Proves `P` holds in initial state and is preserved by every transition	Certora (auto-induction via SMT), K/KEVM (interactive)	Sound for unbounded executions when induction succeeds.

2.2 What FV is not

A surprising fraction of audit clients (and not-so-junior auditors) confuse FV with adjacent practices. Be precise.

FV is not bug-finding by exhaustive testing. That’s fuzzing. Fuzzing samples; FV proves over all of input space.
FV is not “running a verifier”. It’s writing specifications. The verifier is the easy part. The spec — the precise statement of what correct means — is where 80% of the value and 100% of the difficulty lives.
FV is not type-checking on steroids. The Solidity compiler proves syntactic things; FV proves semantic properties of a specific contract.
FV is not Slither + magic. Slither is pattern matching on a syntactic representation. It’s pure heuristics. FV reasons about state and inputs.
FV does not prove your protocol is secure. It proves the properties you wrote down hold. If the property is wrong, weak, or missing, FV says nothing about that gap. The wrong-spec failure mode is the dominant one in practice.

2.3 The “what does this prove?” table

When reading a Certora or Halmos engagement report, the question that separates the senior from the junior is: “what does this prove, and what does this not prove?”

Claim from a spec	What it proves	What it does NOT prove
`assert balanceOf(user) <= totalSupply()` after every function	Balances stay ≤ supply across all single-tx executions	Multi-tx atomic-bundle attacks (flash-loan composition), externally-imposed states
`invariant totalSupply == sum(balances)`	The classic accounting equation holds in all reachable states	That total supply itself is correct (mints could be unauthorized but still preserve sum)
`rule onlyOwnerCanPause` — `pause()` reverts unless `caller == owner`	The auth check is enforced for `pause()`	That owner is correctly set, that owner can’t be hijacked via other paths
`rule deposit increases shares`	`deposit()` strictly increases the receiver’s share balance	That the amount of shares is correctly priced

Auditor’s reflex: every CVL rule should be paired with a one-line “and this would still allow the following exploit” note. If you can’t think of any, your spec is probably too tight (vacuously true) or too loose (you missed a property).

3. When FV Wins Over Fuzzing

Fuzzing and FV are not substitutes; they are complements. They cover different parts of the property space.

3.1 The four regimes where FV wins clearly

3.1.1 Tight invariants that must hold over all paths

“The vault is always solvent: totalAssets() >= totalLiabilities().”

Fuzzing can show this fails in some trace. It cannot show it never fails. If the property is tight — the bound is sharp, and there are many edge inputs that approach it — fuzzing will hit the bound but rarely cross it. FV proves the bound holds or finds the exact path that crosses it.

Example — an ERC-4626 vault’s convertToAssets(shares) must round down in favor of the vault. The bug class is a single division that rounds up in a corner. Fuzzing samples one rounding direction; FV proves it for all share/asset ratios.

3.1.2 Counterexamples are statistically sparse

If the bug requires a specific combination — e.g., (reserveA, reserveB, swapIn) such that an intermediate computation underflows by exactly 1 wei — random fuzzing has a probability 1/2^k of hitting it, where k is the number of constrained bits. For k = 96, you’d need 2^96 ≈ 10^29 runs to hit it once on average. Echidna runs perhaps 10^7 cases per hour. Fuzzing will never see this bug.

Examples:

Kyber Elastic (2023): swap math had a corner involving tick-crossing with a specific liquidity delta. Probability under random fuzz: vanishing.
Curve virtual_price interim-state read: requires entering during a specific re-entrant moment.
Compound COMP distribution (2021): bug triggered by specific index increment overflow.

FV reasons about symbolic values, so the corner is one path among finitely many — not a needle in a haystack.

3.1.3 High-value invariants where “almost always holds” is worthless

Solvency. Accounting conservation. Access control. These either hold or the protocol is dead. “Holds with 99.9999% probability under fuzzing” is meaningless when the protocol holds $1 B an d a s in g l ee x pl o i t d r ain s$ 300M.

For these, the FV cost is paid back the first time an auditor surfaces a Certora reports VIOLATED and ships a counterexample. That has happened repeatedly at Aave, Maker, Lido, Compound. [verify with current Certora blog write-ups]

3.1.4 Regression-safe re-engagement

A protocol fixes a bug. The fix is one line. How do you know the fix didn’t create a new bug, or fail to address a sibling bug?

Unit test: only checks the specific input that was buggy. Sibling cases unverified.
Fuzz: random sampling around the fix. Maybe hits, maybe not.
FV: re-run the spec. If “no unauthorized mint” was proven before and after the fix, you have a mathematical guarantee that the fix preserves the invariant.

This is why Certora is heavily used for upgrades: the spec is a contract between the team and the verifier. Each new version must keep passing.

3.2 The fuzzing-still-wins regimes (don’t oversell FV)

Regime	Why fuzzing wins
Exploratory unknown-unknowns	You don’t know what the invariant is yet. Fuzzing surfaces unexpected reverts and divergences that suggest new invariants to write.
Loop-heavy iterative code	FV must unroll loops. Bound = N means anything ≥ N is unverified. Fuzzing happily runs the loop.
External-call-heavy contracts	FV must abstract or summarise external code (e.g., `DISPATCHER`); abstractions weaken the proof. Fuzzing forks mainnet and uses the real callee.
Time / block-dependent logic	FV reasoning about `block.timestamp` across many txs requires careful environment modeling; fuzzing just bumps time.
Throughput	A modest fuzz harness does `10^6` runs/min; Certora may take 30 min to prove one rule. For an iterating dev loop, fuzz is faster.

3.3 The pipeline (the right way to use both)

flowchart LR
    A[Invariant list<br>from §6 of Tuan-15] --> B[Unit test<br>concrete cases]
    B --> C[Foundry invariant<br>handler-based fuzzing]
    C --> D[Echidna / Medusa<br>overnight fuzz runs]
    D --> E[Halmos check_*<br>BMC on hotspots]
    E --> F[Certora CVL<br>protocol-wide invariants]
    F --> G[Regression<br>re-run on every PR]
    style A fill:#e6f3ff
    style F fill:#c2f0c2
    style G fill:#c2f0c2

Cheap → expensive, left → right. Don’t skip ahead. Fuzz finds counterexamples in seconds; surface them, fix them, then try to formalize.

4. Certora Deep — CVL, Rules, Hooks, Ghosts

Certora is the commercial standard. Its sweet spot is inductive invariants over unbounded executions — properties that must hold across any sequence of any functions in any order. That coverage is impossible with BMC tools like Halmos.

4.1 The toolchain shape

contract.sol  ────►  Solidity compiler  ────►  Bytecode + ABI
spec.spec     ────►  certoraRun CLI    ────►  upload to cloud
                                                 │
                                                 ▼
                                          SMT solver (Z3/CVC5)
                                                 │
                          ┌──────────────────────┼──────────────────────┐
                          ▼                      ▼                      ▼
                   PROVED rule         VIOLATED rule (CEX)       TIMEOUT
                                       call trace + storage      tune spec or
                                       state + msg.sender        increase budget

The Certora cloud runs the actual prover. The local CLI uploads bytecode + spec and downloads results. Free tier exists with limited monthly compute; commercial tiers scale.

4.2 CVL anatomy — the methods block

Every CVL spec starts by declaring which Solidity functions it can reason about and how to model each.

methods {
    // Pure-ish view function with no environment dependency
    function totalSupply() external returns (uint256) envfree;
    function totalAssets() external returns (uint256) envfree;
    function balanceOf(address) external returns (uint256) envfree;
 
    // State-changing functions: need an env (block context)
    function deposit(uint256, address) external returns (uint256);
    function redeem(uint256, address, address) external returns (uint256);
 
    // External-contract calls: summarise behaviour
    function _.transferFrom(address, address, uint256) external => DISPATCHER(true);
    function _.balanceOf(address) external => DISPATCHER(true);
 
    // For un-summarisable external code, abstract as nondeterministic
    function someOracle.price() external returns (uint256) => NONDET;
}

envfree is a load-bearing keyword. It says: “this function’s return value does not depend on msg.sender, block.timestamp, etc.” If the prover proves this, calls to the function in rules don’t need an env argument — your specs get much cleaner. If it can’t prove envfree, your declaration fails and you must pass env everywhere.

Summaries for external functions:

Summary	Semantics	When to use
`DISPATCHER(true)`	When the contract’s address matches a known contract, dispatch to its code; otherwise nondeterminstic	Most ERC-20 calls — the test harness provides concrete implementations
`DISPATCHER(false)`	Same, but if no match, havoc state	Default for cautious modelling
`NONDET`	The function returns an arbitrary value; no state changes outside this contract	When external is fully opaque (oracle, off-chain)
`ALWAYS(<expr>)`	The function returns the constant expr	When you know the return is fixed for the proof
`AUTO`	Default — Certora picks based on context	Use only when you’ve checked it picked sensibly
Function summary `=> myCVLFunction(args)`	Replace with CVL-defined helper	Heavy customisation

4.3 Writing a rule — parametric vs concrete

Concrete rule — one named function, one property:

rule depositMintsShares(uint256 assets, address receiver) {
    env e;
 
    // Preconditions
    require receiver != 0;
    require receiver != currentContract;
    require assets > 0;
 
    uint256 sharesBefore = balanceOf(receiver);
 
    uint256 mintedShares = deposit(e, assets, receiver);
 
    uint256 sharesAfter = balanceOf(receiver);
 
    assert sharesAfter == sharesBefore + mintedShares,
        "receiver's share balance must increase by minted amount";
    assert mintedShares > 0,
        "non-zero deposit must mint non-zero shares";
}

Parametric rule — quantified over all methods:

// Prove that NO function ever increases totalSupply except the authorised mint paths
rule onlyAuthorisedMint(method f) filtered {
    f -> !f.isView && f.selector != sig:mint(address,uint256).selector
       && f.selector != sig:deposit(uint256,address).selector
} {
    env e;
    calldataarg args;
 
    uint256 supplyBefore = totalSupply();
    f(e, args);
    uint256 supplyAfter = totalSupply();
 
    assert supplyAfter <= supplyBefore,
        "non-mint function increased totalSupply — unauthorised mint?";
}

The method f plus the filter block tell Certora to re-prove this rule once per function in the contract (except the two explicitly authorised ones). One spec, full coverage. This is the killer feature: you specify the property once, and the prover sweeps every entry point. Most reentrancy and access-control bugs surface this way: some obscure function turns out to mutate state that “shouldn’t” be mutated, and the parametric rule catches it.

4.4 Invariants — inductive properties

A CVL invariant is a predicate that must hold in every reachable state. Certora proves it by induction:

Base case: the predicate holds in the initial state (after constructor).
Inductive step: assuming it holds before a function call, prove it still holds after.

invariant totalSupplyEqualsSumOfBalances()
    totalSupply() == sumOfBalances();

where sumOfBalances is a ghost (next section). Certora attempts to discharge the inductive step automatically for every function. If a function violates it, you get a counterexample — that function is the bug, even if it’s not the function the developer “thought” was about supply.

Preserved blocks — when the prover can’t prove the inductive step alone, you add hints:

invariant totalSupplyEqualsSumOfBalances()
    totalSupply() == sumOfBalances()
    {
        preserved transfer(address to, uint256 amount) with (env e) {
            require e.msg.sender != to;  // self-transfer corner ruled out
            require balanceOf(e.msg.sender) >= amount;
        }
        preserved {
            require totalSupply() < 2^128;  // global bound to avoid SMT timeout
        }
    }

preserved clauses add additional preconditions to the inductive step per method (or unrestricted). They are the auditor’s escape hatch when the prover times out — you trim the input space to what’s actually relevant.

Caveat: every require in a preserved block is a trust assumption. The invariant is “proved given these assumptions”. A senior auditor flags any preserved-block requirement that looks suspicious — it can be hiding the bug.

4.5 Ghost variables and hooks — shadow state

Ghosts are CVL-only variables maintained alongside contract storage. They model abstract quantities the contract doesn’t directly track.

// Ghost: total amount ever deposited (across all users, all time)
ghost mathint totalDepositedEver {
    init_state axiom totalDepositedEver == 0;
}
 
// Hook: every time a user's deposit balance increases, bump the ghost
hook Sstore deposits[KEY address user] uint256 newVal (uint256 oldVal) {
    totalDepositedEver = totalDepositedEver + newVal - oldVal;
}
 
// Rule using the ghost as an accounting check
invariant solvency()
    to_mathint(totalAssets()) >= totalDepositedEver - totalWithdrawnEver;

mathint = unbounded mathematical integer (no overflow). Use this for ghosts; only convert to/from uint256 when interacting with the contract.

Hooks instrument the EVM-level operations:

Sstore — fires on every storage write to the named slot/key.
Sload — fires on every read.
CALL, STATICCALL, DELEGATECALL — fires on external calls.

The hook runs inside the prover’s symbolic simulation, not at run time. There’s no gas cost. Ghosts are pure spec.

Pattern: sum-of-mapping ghost (the workhorse for ERC-20-style invariants):

ghost mathint sumOfBalances {
    init_state axiom sumOfBalances == 0;
}
 
hook Sstore _balances[KEY address user] uint256 newBal (uint256 oldBal) {
    sumOfBalances = sumOfBalances + newBal - oldBal;
}
 
// Now the totalSupply invariant is:
invariant totalSupplyEqualsSumOfBalances()
    to_mathint(totalSupply()) == sumOfBalances;

Once you have this ghost, you can prove conservation across every function in 5 lines of CVL.

4.6 Common rule patterns — the auditor’s reusable library

4.6.1 Monotonicity

“Once paused, the contract never unpauses without onlyAdmin re-entering.” “User’s nonce is monotonically increasing.”

rule nonceMonotone(address user, method f) filtered { f -> !f.isView } {
    env e;
    calldataarg args;
    uint256 nonceBefore = nonces(user);
    f(e, args);
    uint256 nonceAfter = nonces(user);
    assert nonceAfter >= nonceBefore;
}

4.6.2 No-magic-mint / no-unauthorised-state-change

“totalSupply never increases except via mint called by MINTER_ROLE.”

rule noUnauthorisedMint(method f) filtered { f -> !f.isView } {
    env e;
    calldataarg args;
 
    uint256 supplyBefore = totalSupply();
    f(e, args);
    uint256 supplyAfter = totalSupply();
 
    assert supplyAfter > supplyBefore =>
        (f.selector == sig:mint(address,uint256).selector &&
         hasRole(MINTER_ROLE(), e.msg.sender));
}

4.6.3 Conservation of value

“After any swap, tokenA.reserve * tokenB.reserve >= k_before (constant-product invariant, with fee adjustment).”

This is the AMM solvency rule. Direct CVL translation, modulo fee-math precision.

4.6.4 Role transitions

“Only the current admin can grant DEFAULT_ADMIN_ROLE to a new account.”

rule onlyAdminGrantsAdmin(address account, method f, env e) {
    bool wasAdmin = hasRole(DEFAULT_ADMIN_ROLE(), account);
    calldataarg args;
    f(e, args);
    bool isAdminNow = hasRole(DEFAULT_ADMIN_ROLE(), account);
    assert (!wasAdmin && isAdminNow) =>
        hasRole(DEFAULT_ADMIN_ROLE(), e.msg.sender);
}

4.6.5 Function-pair inverse

“deposit followed by redeem of the resulting shares returns the user to their starting position (within rounding).”

rule depositRedeemInverse(uint256 assets, address user) {
    env e;
    require user != currentContract && user != 0;
    require assets > 0;
 
    uint256 startBal = underlying.balanceOf(user);
    uint256 shares = deposit(e, assets, user);
    uint256 redeemed = redeem(e, shares, user, user);
    uint256 endBal = underlying.balanceOf(user);
 
    // Allow 1-wei rounding tolerance (always in favour of vault)
    assert endBal <= startBal;
    assert endBal + 1 >= startBal - 1;
}

This is the rule that catches inflation attacks, donation bugs (Euler 2023’s class), and rounding-direction errors.

4.7 Sanity checks (built-in)

Certora ships built-in sanity rules to catch the most embarrassing class of false-positive: vacuous specs. A vacuous rule is one whose preconditions are unsatisfiable — Certora “proves” it because no input even reaches the assertion.

// VACUOUS — preconditions contradict each other
rule oops(uint256 x) {
    require x > 100;
    require x < 50;
    assert someFunction(x) == 0;  // proved trivially
}

Run with --rule_sanity basic (or advanced) to surface these. Every Certora engagement should turn sanity on. A “passed” rule that’s vacuous is worse than a failing rule — it gives false security.

4.8 CVL — limitations and gotchas

Limitation	Mitigation
SMT timeout on complex specs	Tighten preserved blocks; bound `mathint`; split rule into smaller rules; use `--smt_timeout` with care
Unrolling loops	Set `--loop_iter N` for BMC of bounded loops; for unbounded loops, prove an invariant about the loop body instead
External calls to unknown code	Use `DISPATCHER(true)` with concrete mocks; or `NONDET` and accept the abstraction
`block.timestamp` reasoning across many txs	Add explicit `env` constraints on time progression
Storage layout assumptions	Hooks reference slots by Solidity name (`balances[KEY ...]`), but be aware that proxy upgrades may shift layout — re-run on each version
Floating-point / fixed-point math	Certora prefers integer math; `mathint` covers most of it. Fixed-point bugs sometimes need careful normalisation in the spec.
Cost	Free tier exists with monthly compute quota; commercial pricing per protocol. Plan for 1-2 weeks of CVL ramp-up before productivity.

4.9 Reading a counterexample

A Certora “VIOLATED” report comes with:

Call trace — the exact sequence of function calls.
Storage state — values of every read storage slot at each step.
msg.sender / msg.value / block.timestamp — the env at each call.
The assertion that failed — your own assert.

The auditor’s job: translate this to a concrete PoC. Usually forge test with the exact inputs Certora chose, ideally reproduced in a test that fails before the fix and passes after.

Failure mode: the counterexample is in a corner that the developer dismisses as “unreachable in practice”. Sometimes they’re right (the corner requires a state that, by construction, never occurs because of an external invariant). Sometimes they’re wrong, and the auditor’s job is to demonstrate, with a mainnet-fork PoC, that the state is reachable. Never accept “this can’t happen” without a written argument that itself becomes a trust assumption in the audit report.

4.10 Real audits using Certora — recent track record

[verify all of these with current Certora blog]

Aave V4 — multi-month engagement, deep AMM and lending invariants.
Compound — CTKO accounting, distribution math.
MakerDAO / Dai — global solvency invariants; PSM-specific rules.
Lido — dual-governance system; staking-router invariants. Found governance bugs pre-mainnet.
Uniswap V4 — hook composability; found vulnerabilities involving malicious hooks that fuzzing missed.
Morpho — lending vault invariants.
Silo / Texture / infiniFi / Kamino — multiple bugs caught pre-deployment per Certora blog write-ups.

Pattern: the higher the TVL and the more complex the accounting, the more lopsided the cost/benefit toward FV.

5. Halmos Deep — BMC Against Foundry Tests

Halmos (from a16z) is the open-source counterpart to Certora for a large class of single-tx properties. It reads Foundry-style tests, replaces concrete inputs with symbolic ones, and uses an SMT solver to explore all paths.

5.1 Installation and first run

# Recommended: uv (Astral) for Python tooling
curl -LsSf https://astral.sh/uv/install.sh | sh
uv tool install --python 3.12 halmos
 
# Verify
halmos --version
# halmos 0.3.x  [verify]

# In a Foundry project
cd my-foundry-project
forge build
halmos

By convention, tests Halmos runs are prefixed check_* (not test_*). This separates pure Foundry tests from symbolic ones; both can co-exist.

5.2 The `check_*` pattern

// test/halmos/CheckVault.t.sol
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;
 
import "forge-std/Test.sol";
import "halmos-cheatcodes/SymTest.sol";
import "../src/Vault.sol";
 
contract CheckVault is Test, SymTest {
    Vault vault;
 
    function setUp() public {
        vault = new Vault();
    }
 
    /// @dev Property: balanceOf(user) is always ≤ totalSupply()
    function check_balanceWithinSupply(address user) public {
        // Symbolic precondition: avoid trivial reverts
        vm.assume(user != address(0));
 
        // The property
        assert(vault.balanceOf(user) <= vault.totalSupply());
    }
 
    /// @dev Property: deposit followed by withdraw is value-preserving
    function check_depositWithdrawRoundtrip(uint256 amount, address user) public {
        vm.assume(amount > 0 && amount < 2**128);
        vm.assume(user != address(0) && user != address(vault));
 
        uint256 before = vault.balanceOf(user);
        vm.prank(user);
        vault.deposit(amount);
        vm.prank(user);
        vault.withdraw(amount);
        uint256 afterBal = vault.balanceOf(user);
 
        assert(afterBal == before);
    }
}

Run:

halmos --function check_balanceWithinSupply --loop 4
halmos --function check_                        # all check_* functions

5.3 Symbolic primitives — `svm` cheatcodes

Halmos provides halmos-cheatcodes/SymTest.sol with helpers:

import "halmos-cheatcodes/SymTest.sol";
 
contract CheckSym is Test, SymTest {
    function check_thing() public {
        uint256 x        = svm.createUint256("x");          // fully symbolic uint256
        bytes memory data = svm.createBytes(64, "data");    // 64-byte symbolic bytes
        address addr     = svm.createAddress("addr");        // symbolic address
        bool flag        = svm.createBool("flag");
 
        // Use as ordinary values; Halmos tracks them symbolically
        if (flag && x > 100) {
            target.foo{value: 0}(addr, data, x);
        }
        assert(/* something */);
    }
}

svm.createXXX are the load-bearing primitives. Function parameters are also symbolic by default — for many properties, that’s enough.

5.4 Loop unrolling and bounds

Halmos is bounded — it doesn’t reason about unbounded loops automatically. Configure:

halmos --loop 8       # unroll loops up to depth 8
halmos --array-lengths 0,1,2,32   # try arrays of these lengths
halmos --solver-timeout-assertion 60000  # 60s per assertion

If your code has a loop with N iterations and you set --loop 4, Halmos verifies paths with ≤ 4 iterations and silently skips longer ones. This is the most common source of false-pass with Halmos. Read the output: it lists paths cut off by the bound.

5.5 Halmos vs Certora — what to use when

Property type	Halmos	Certora
Single-tx pre/post over symbolic inputs	Excellent (free, fast iteration)	Overkill
Pure math library (TickMath, FullMath, ABDK)	Excellent	Excellent
Access control matrix	Excellent	Excellent
Multi-tx invariants (state machine)	Limited (need to manually compose)	Native (invariants are first-class)
Inductive invariants over unbounded executions	No	Yes
Reasoning about ghost / shadow state	No	Yes (ghosts + hooks)
Loop verification beyond a depth	No (bounded only)	Limited (needs loop invariants)
Cost	Free	Commercial / limited free tier

Rule of thumb: Halmos for tight, single-function properties you can express as assert(<property>) after a symbolic call. Certora for the global accounting properties that hold across any sequence of any function calls.

5.6 Halmos limitations

Limitation	Mitigation
Loop bound	Set explicit `--loop N`; never trust the default for production specs
External calls to unknown code	Mock the callee with a concrete contract; or use `vm.mockCall`
Storage state explosion	Reduce state via `setUp` — start from a constrained state
*`block.` symbolic**	Constrain `vm.warp` / `vm.roll` to specific values where appropriate
Slow solver on rich symbolic arrays	Use `--array-lengths 0,1,2` to cover the relevant cases

6. K Framework, KEVM, Kontrol

The K framework is a meta-tool for defining the operational semantics of any programming language. Runtime Verification (RV) has used K to define the EVM itself (KEVM) and to verify properties of contracts against the full EVM semantics — every opcode, every gas cost, every edge case.

6.1 What K gives you that Certora doesn’t

Certora abstracts the EVM via Solidity-level semantics. That works for 99% of contracts but fails for:

Compiler-bug-resistant proofs — proving a property against the actual bytecode independent of any compiler.
EVM specification proofs — proving properties of EVM precompiles, EIPs, opcodes.
Cross-language verification — Vyper, Huff, Yul all compile to EVM; K reasons at the EVM layer.

KEVM has been used to formally verify:

Casper FFG (Ethereum PoS finality) — pre-merge spec verification.
ERC-20 and ERC-721 reference implementations — KEVM-verified token semantics.
EIP-7212 (P-256 precompile) — verified semantics. [verify]
Beacon Chain components — selected pieces. [verify]
Uniswap V1 math — early proof, historical interest.
DEX, MakerDAO components — RV engagement work.

6.2 Kontrol — KEVM meets Foundry

For application auditors, raw K is overkill. RV’s Kontrol wraps KEVM in a Foundry-compatible interface, similar to Halmos but with K’s deeper-semantic backend.

# Sketch — verify by reading current Kontrol docs
kontrol build
kontrol prove --match-test test_invariant

Kontrol’s advantage: inductive proofs over unbounded loops (Halmos can’t), with a more rigorous EVM model than Certora’s. Disadvantage: steeper learning curve and commercial pricing.

When to consider Kontrol over Certora:

The protocol relies on subtle EVM semantics (gas accounting in payment systems, precise revert behaviour, precompile-driven code).
You need a proof against bytecode independent of the source language.
You’re verifying an EIP or a protocol-layer change.

For routine application audits, Certora + Halmos covers >95% of practical needs. KEVM/Kontrol is the specialist tool.

6.3 K framework workflow (for the curious)

K specs are rewrite rules over configurations (state). The auditor writes:

rule <k> X +Int Y => Z </k>
  requires Z ==Int X +Int Y

This is closer to interactive theorem proving than push-button verification. Plan months, not weeks, to become productive.

7. hevm — The Open-Source Symbolic Engine

hevm (originally part of dapptools, now maintained by the Ethereum Foundation’s formal-methods team) is a Haskell implementation of the EVM optimized for symbolic execution and equivalence checking.

7.1 What hevm offers

Symbolic execution of arbitrary EVM bytecode — no source needed.
Foundry-compatible test runner — hevm test runs Forge tests symbolically.
Equivalence checking — prove two bytecodes have identical behaviour for all inputs. Useful for verifying compiler optimisations, refactors, gas-optimised rewrites.

7.2 Where hevm fits

Bytecode-only contracts — when source is unavailable (rare in audits, but happens on forks).
Compiler-output verification — confirm a Yul rewrite is equivalent to original Solidity.
As a backend — Halmos and other tools historically borrowed pieces from hevm/dapptools heritage.

7.3 Practical caveat

hevm is excellent technology with smaller community adoption than Halmos or Certora. Use it when:

You need equivalence checking (no other tool offers this for EVM out-of-the-box).
You’re verifying bytecode without source.
You’re already in the Haskell ecosystem.

Otherwise, Halmos is the smoother on-ramp for symbolic testing.

8. Mythril & Manticore — Historical Context

The 2018-era symbolic execution wave produced:

Mythril (ConsenSys) — bytecode-level symbolic analysis with built-in detectors for common bug classes. Still occasionally used for one-off bytecode review.
Manticore (Trail of Bits) — generic symbolic execution engine, EVM module. Largely superseded by Halmos for EVM use cases.

Auditor’s view: don’t reach for these as a first choice in 2026. Halmos has the better UX, hevm has the better foundation, Certora has the better proof power. Mythril and Manticore are part of the lineage; cite them in the methodology section of an audit report only when there’s a specific reason.

9. The Auditor’s Four-Phase Workflow

The point of this entire chapter is not “learn three tools” but “integrate FV into a normal audit”. The workflow:

9.1 Phase 1 — Identify invariants

This is the work from Week 15 §6. FV cannot tell you what to prove; it can only check what you claim.

Sources, in priority order:

Whitepaper / protocol design doc — explicit invariants (“the system is always solvent”).
NatSpec — function-level invariants the developer documented.
require / revert statements — implicit invariants the developer enforced.
Test suite — assertions are invariants the developer expected.
Your own threat model — “if X stopped being true, who profits?”

Produce a written list:

# Invariants — <Project>
## Global
- G1: totalSupply() == Σ balances[i] for all i
- G2: totalAssets() ≥ totalLiabilities()
- G3: Only DEFAULT_ADMIN_ROLE can change DEFAULT_ADMIN_ROLE

## Per-module: Vault
- V1: shares minted on deposit ≥ shares burned on equivalent redeem (rounding favours vault)
- V2: convertToAssets is monotonic non-decreasing in shares
- V3: redeem reverts when paused
...

Each line is a candidate for a CVL rule, a Halmos check_*, or both.

9.2 Phase 2 — Spec in the right tool

Decision table:

Property shape	Best tool
”Function `f` returns X iff Y, for any input”	Halmos `check_*`
”No function can move state from A to B except via path Z”	Certora parametric rule
”Across any sequence of function calls, P holds”	Certora invariant
”Pure math library never overflows for input in range”	Halmos (cheap) or Certora
”Storage layout consistent across upgrade V1 → V2”	OZ upgrades-plugin static check (cheaper than FV)
“Solvency under flash-loan attack”	Foundry fuzz first, then Certora rule with multi-tx env

9.3 Phase 3 — Tighten the spec until silence

Workflow:

Write the rule. Run the prover.
Counterexample: did you find a bug, or is your spec too loose? Inspect the CEX:
- Real bug: file the finding, fix the code, re-run.
- Spec too loose: tighten preconditions (a require in CVL or vm.assume in Halmos) and re-run.
- Spec too tight (vacuous): a sanity check should catch this; relax.
Timeout: the spec is true but the solver can’t prove it in time. Decompose: split the property, add intermediate lemmas, bound mathints, or add preserved blocks.
Proved: rule passes. Move to the next.

The single most important habit: every time the prover says “proved”, ask yourself: would this still pass if I deleted the function body? If yes, the spec is vacuous. Add a sanity check or a complementary “this function actually does something” rule.

9.4 Phase 4 — Maintain across refactors

Specs are part of the code. They live in certora/ or test/halmos/ and are re-run on every PR. The CI pipeline:

# .github/workflows/fv.yml — conceptual
jobs:
  halmos:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: forge build
      - run: halmos --function check_ --loop 8
  certora:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install certora-cli
      - run: certoraRun certora/run.conf

A PR that breaks a CVL rule should fail CI as loudly as a broken unit test. Specs are how the protocol team and the auditor talk to each other across upgrades.

10. Lab

10.1 Setup

mkdir -p ~/web3-sec-lab/bonus-fv && cd ~/web3-sec-lab/bonus-fv
forge init --no-commit lending-fv
cd lending-fv
 
# Halmos
curl -LsSf https://astral.sh/uv/install.sh | sh
uv tool install --python 3.12 halmos
halmos --version
 
# Halmos cheatcodes lib
forge install a16z/halmos-cheatcodes --no-commit
 
# Certora CLI (optional; free tier requires account)
pip install certora-cli
solc-select install 0.8.24 && solc-select use 0.8.24

10.2 The target — a small lending contract

// src/MiniLending.sol
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;
 
contract MiniLending {
    address public admin;
    mapping(address => uint256) public balanceOf;
    uint256 public totalSupply;
    bool public paused;
 
    modifier onlyAdmin() {
        require(msg.sender == admin, "not admin");
        _;
    }
 
    modifier whenNotPaused() {
        require(!paused, "paused");
        _;
    }
 
    constructor() {
        admin = msg.sender;
    }
 
    function deposit(uint256 amount) external whenNotPaused {
        balanceOf[msg.sender] += amount;
        totalSupply += amount;
    }
 
    function withdraw(uint256 amount) external whenNotPaused {
        require(balanceOf[msg.sender] >= amount, "insufficient");
        balanceOf[msg.sender] -= amount;
        totalSupply -= amount;
    }
 
    function mint(address to, uint256 amount) external onlyAdmin {
        balanceOf[to] += amount;
        totalSupply += amount;
    }
 
    function pause() external onlyAdmin {
        paused = true;
    }
 
    function unpause() external onlyAdmin {
        paused = false;
    }
 
    // The subtle bug we'll plant in Lab 3 lives here.
    function airdrop(address to, uint256 amount) external {
        balanceOf[to] += amount;          // ← OOPS: totalSupply not updated
    }
}

The airdrop is deliberately buggy — it increments a balance without authorisation and without updating totalSupply. Echidna with random inputs will rarely care (no obvious assertion fires). Halmos and Certora will both surface it instantly when given the right property.

10.3 Lab 1 — Halmos `check_*`: balance ≤ totalSupply

// test/halmos/CheckLending.t.sol
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;
 
import "forge-std/Test.sol";
import "halmos-cheatcodes/SymTest.sol";
import "../../src/MiniLending.sol";
 
contract CheckLending is Test, SymTest {
    MiniLending lending;
 
    function setUp() public {
        lending = new MiniLending();
    }
 
    /// @notice Property: balanceOf(user) <= totalSupply, after any single
    ///         user-callable function call.
    function check_balanceWithinSupply(
        address caller,
        address user,
        uint256 amount,
        uint8 fnSelector
    ) public {
        vm.assume(caller != address(0));
        vm.assume(user != address(0));
        vm.assume(amount < 2**128);  // avoid trivial overflows
 
        // Start in a state where the invariant holds
        require(lending.balanceOf(user) <= lending.totalSupply());
 
        // Pick a function to invoke
        vm.prank(caller);
        if (fnSelector == 0) {
            try lending.deposit(amount) {} catch {}
        } else if (fnSelector == 1) {
            try lending.withdraw(amount) {} catch {}
        } else if (fnSelector == 2) {
            try lending.airdrop(user, amount) {} catch {}
        }
 
        // Invariant must still hold
        assert(lending.balanceOf(user) <= lending.totalSupply());
    }
}

Run:

halmos --function check_balanceWithinSupply --loop 4 -v

Expected: Halmos returns Counterexample for the airdrop path:

[FAIL] check_balanceWithinSupply(...)
  Counterexample:
    caller = 0x...
    user   = 0x...
    amount = 1
    fnSelector = 2
  Path: airdrop(user, 1) was called; balanceOf[user] became 1; totalSupply stayed 0;
  Assertion: balanceOf(user) <= totalSupply() — VIOLATED (1 <= 0 is false)

Why Echidna might miss this: an Echidna campaign without an explicit property_balanceWithinSupply won’t surface this. With the property, it likely does find it — but only because the bug is one operation away. In a real protocol where airdrop is buried behind a setter / role / time check, the search space explodes and fuzzing misses.

Patch: in airdrop, also do totalSupply += amount;, and add an onlyAdmin modifier. Re-run; Halmos reports proved.

10.4 Lab 2 — Certora CVL: no unauthorised mint

This lab is structured so you can either run it against the Certora cloud (free tier) or document the spec without running.

// certora/MiniLending.spec
 
methods {
    function admin() external returns (address) envfree;
    function balanceOf(address) external returns (uint256) envfree;
    function totalSupply() external returns (uint256) envfree;
    function paused() external returns (bool) envfree;
}
 
// G1: totalSupply equals sum of balances (the workhorse invariant)
ghost mathint sumOfBalances {
    init_state axiom sumOfBalances == 0;
}
 
hook Sstore balanceOf[KEY address user] uint256 newBal (uint256 oldBal) {
    sumOfBalances = sumOfBalances + newBal - oldBal;
}
 
invariant totalSupplyEqualsSumOfBalances()
    to_mathint(totalSupply()) == sumOfBalances;
 
// G2: no function increases totalSupply except mint() called by admin,
// or deposit() called by anyone
rule noUnauthorisedSupplyIncrease(method f) filtered {
    f -> !f.isView
} {
    env e;
    calldataarg args;
 
    uint256 supplyBefore = totalSupply();
    f(e, args);
    uint256 supplyAfter = totalSupply();
 
    assert supplyAfter > supplyBefore => (
        f.selector == sig:deposit(uint256).selector ||
        (f.selector == sig:mint(address,uint256).selector && e.msg.sender == admin())
    ), "supply increased via unauthorised path";
}

Run (if you have certora-cli configured):

certoraRun src/MiniLending.sol \
  --verify MiniLending:certora/MiniLending.spec \
  --solc solc-0.8.24 \
  --rule_sanity basic \
  --rule noUnauthorisedSupplyIncrease

Expected output:

totalSupplyEqualsSumOfBalances — VIOLATED on airdrop (balances change, supply doesn’t).
noUnauthorisedSupplyIncrease — PASSES on the buggy version because airdrop doesn’t touch supply at all, but fails if we “fix” airdrop naively by also bumping totalSupply without auth (then any caller can mint).

Both rules together pin down the bug: airdrop either breaks the sum invariant (current) or breaks the authorisation invariant (naive fix). The correct fix is onlyAdmin on airdrop, treating it as another minting function. The audit finding writes itself.

If you don’t want to run: still document the spec in certora/MiniLending.spec and include it in the audit deliverables. The spec is part of the report; running it is a separate operational step.

10.5 Lab 3 — The bug Echidna missed in N runs

Set up a basic Echidna campaign:

// test/echidna/EchidnaLending.sol
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;
 
import "../../src/MiniLending.sol";
 
contract EchidnaLending {
    MiniLending lending;
 
    constructor() {
        lending = new MiniLending();
    }
 
    function echidna_balance_within_supply() public view returns (bool) {
        return lending.balanceOf(address(this)) <= lending.totalSupply();
    }
}

# echidna.yaml
testMode: property
testLimit: 1000000
seqLen: 50

Run:

echidna test/echidna/EchidnaLending.sol --contract EchidnaLending --config echidna.yaml

Critical observation: Echidna’s auto-generated transaction sequences will include airdrop(this, ...) with high probability and will catch this bug — because the contract has only six functions. The bug-by-fuzz / bug-by-FV gap closes when the search space is small.

Now make the bug subtle: wrap airdrop behind a permission gate that’s almost-correct:

mapping(address => bool) public airdropper;
 
function setAirdropper(address who, bool ok) external onlyAdmin {
    airdropper[who] = ok;
}
 
function airdrop(address to, uint256 amount) external {
    require(airdropper[msg.sender] || tx.origin == admin, "not allowed");  // ← bug
    balanceOf[to] += amount;
    totalSupply += amount;
}

The bug: tx.origin == admin is exploitable any time the admin interacts with an attacker contract (classic Week 05 anti-pattern). Plus, the airdropper map starts empty, so the only path is through tx.origin.

Echidna at default config will rarely synthesise the exact sequence “admin calls a contract that re-enters and calls airdrop”. The search space includes choosing msg.sender == admin for the right call out of 2^160 addresses.

Halmos (with caller as a symbolic address) tries all caller values; the tx.origin == admin path is one solver constraint away.

Certora (with a parametric rule “no function makes a non-admin a minter”) flags the tx.origin == admin branch as a path violating the rule.

Demo: run Echidna for 1M sequences — usually no fail unless the admin-EOA is included in targetSenders. Run Halmos / Certora — instant counterexample.

The pedagogical point: FV is not magic, but it eliminates the “I needed to think of the exact sequence” requirement. The auditor states the property, the solver figures out the sequence.

10.6 Lab — Stretch

Add a re-entry path: make withdraw call an external receive hook. Write a Halmos check_no_reentry_can_drain rule and prove it. (Hint: Halmos will need an explicit external mock for the callback.)
Ghost-based solvency: in Certora, add a ghost mathint totalDeposited and totalWithdrawn, then prove totalAssets() >= totalDeposited - totalWithdrawn as an invariant. Add a deliberate accounting bug (forget to decrement on withdraw); confirm Certora catches it.
Reverse engineer: take the Curve virtual_price read-only reentrancy from Tuan-05-Vulnerability-Classes-Part-1 §2.2.4. Write a Halmos rule that captures the property “view returns the same value as a hypothetical fresh state read”. This is hard — by the end, you’ll understand why read-only reentrancy is so widely missed.

11. Anti-patterns (FV-specific, add to the master checklist)

12. Trade-offs

Decision	Option A	Option B	Auditor’s view
First-line FV tool	Halmos	Certora	Halmos for fast iteration; Certora for protocol-wide invariants. Real shops use both.
Spec language proficiency	CVL only	CVL + Halmos + SMT-LIB	Multi-tool is the senior path. CVL alone is fine for years.
Free vs commercial	Free tier + Halmos	Full Certora engagement	For independent auditors, free tier is workable for medium engagements. For top-tier audits (Aave-class), commercial.
Spec depth	One rule per invariant (terse)	Many small rules + sanity (verbose)	Verbose. Each small rule fails for a clear reason; one big rule fails ambiguously.
When to start FV in engagement	Day 1	After fuzz harness exists	After fuzz. Fuzz first surfaces “obvious” bugs at minimum cost; FV is for the rest.
Trusting `preserved` blocks	Liberal	Conservative	Conservative. Every `require` in a preserved block is a trust assumption that must appear in the report.
KEVM / Kontrol	Skip	Adopt	Skip for routine audits. Adopt only when working at protocol/EVM-EIP layer.

13. Quiz (≥80% to claim FV literacy)

Q: What’s the difference between bounded model checking and inductive theorem proving? A: BMC unrolls execution to a fixed depth N and proves the property over all paths of length ≤ N — sound only up to that bound. Inductive theorem proving proves a base case plus inductive step, so the property holds for unbounded execution lengths. Halmos is BMC; Certora invariants are inductive (with SMT-discharged steps).
Q: Why does Certora call a rule “vacuous” and why is that a finding? A: A vacuous rule has preconditions that are unsatisfiable — the assertion is trivially true because no input even reaches it. It’s worse than failing because it gives false confidence. Solution: enable --rule_sanity to flag vacuity.
Q: You write a CVL invariant totalSupply() == sumOfBalances. Certora reports VIOLATED on transfer(a, b, amt). The CEX shows a == b. How do you proceed? A: Self-transfer corner. Either (a) prove the contract guards a != b, or (b) add preserved transfer(address from, address to, uint256 amt) with (env e) { require from != to; } and explicitly note in the audit report that self-transfer is excluded from the proof. Then verify self-transfer behaviour separately.
Q: Halmos reports a check_* rule as PASSED but you suspect a loop bound caused it. What do you do? A: Check the Halmos output for “path truncated due to loop bound” messages. Increase --loop and re-run. If you can’t make the bound large enough due to solver timeout, the rule is only proved up to that depth — flag in the report.
Q: Why is DISPATCHER(true) usually preferred over NONDET for ERC-20 calls in CVL? A: DISPATCHER(true) dispatches to a concrete implementation when one is available in the verification context, allowing the prover to reason about actual ERC-20 semantics (balanceOf, transferFrom). NONDET makes the call return an arbitrary value, which is too weak — most ERC-20-using rules become vacuous or unprovable. Use NONDET only when the external is genuinely opaque (oracles, off-chain).
Q: When would you reach for KEVM / Kontrol over Certora? A: When the property is about EVM semantics rather than Solidity semantics — e.g., verifying a new precompile, an EIP, gas-accounting in a payment protocol, or a property that must hold against the exact bytecode independent of compiler version.
Q: A protocol has a 1000-iteration loop computing fee accruals. Can you Halmos this? A: Not directly — Halmos would need --loop 1000, which likely times out. Strategies: (1) prove an invariant about a single iteration of the loop body (loop-invariant style) and inductively argue the whole loop preserves it; (2) refactor the contract to a closed-form computation; (3) use Certora’s invariant machinery, which handles unbounded iteration via induction.
Q: Echidna runs 10M cases and finds nothing. Should you trust the protocol? A: No. The negative result is evidence of rarity, not absence. If the invariant is high-value, follow up with Halmos or Certora to prove the property over the symbolic space — or accept that you only have probabilistic confidence.
Q: A CVL preserved block contains require msg.sender != address(0). Is that a problem? A: Almost certainly not — msg.sender == 0 is not a reachable EVM state in practice (CALLER opcode never produces zero in normal transactions). Acceptable. The auditor’s reflex remains: every preserved-block constraint must be documented and justified.
Q: A protocol team says “we have 95% line coverage in fuzz; we don’t need FV”. How do you push back? A: Line coverage measures paths exercised, not states explored. A fuzz that hits every line still may never enter the specific state combination that triggers a bug — and the line-coverage metric is silent about it. FV reasons about the state space, which is the right granularity for accounting invariants. Pushback with a concrete invariant the fuzz cannot prove (e.g., “totalSupply == sum of balances after any sequence of N transactions”), and offer to formalise it.

14. Deliverables

Lab 1 (Halmos check_balanceWithinSupply) — buggy version surfaces CEX; patched version proves.
Lab 2 (Certora noUnauthorisedSupplyIncrease) — spec written; run if free-tier configured.
Lab 3 (the Echidna-miss demo) — written walk-through plus the contrasting Halmos / Certora result.
Notes: in your own words, the four cases where FV beats fuzzing and the four where fuzzing wins.
One paragraph: how you would pitch FV to a client deciding between a $50 K a u d i t an d a$ 50K + $40K-FV audit. What’s the case, what’s the deliverable, what’s the maintenance commitment?
Update the master audit checklist with the FV anti-patterns from §11.

15. Where this leads

This bonus chapter slots into your workflow at the top end of the rigor scale. Most audits will still rely on Slither + manual review + Echidna + Foundry-invariant — and rightly so. FV is the tool you reach for when:

The protocol’s TVL is large enough that a single sparse-input bug is unacceptable.
The accounting math is the primary product, not an incidental.
The protocol will undergo many upgrades and regression-safe specs are worth their weight.
The client is willing to fund the engagement.

The complementary chapters are:

Tuan-Bonus-Fuzzing-Invariant-Advanced — the other half of property-based verification; handler-based fuzzing, stateful invariants, Medusa-specific patterns.
Tuan-Bonus-ZK-Circuit-Security — circuit-level formal reasoning; under-constrained-bug detection.
Tuan-15-Audit-Methodology-Tooling — the parent methodology lesson that places FV in the audit pipeline.

Read once for orientation, return when you have a protocol where the math actually has to be right. The first time you ship a Certora VIOLATED finding to a client who would have shipped with a fuzz pass, you’ll understand why the top of the audit market has converged on this toolchain.

Last updated: 2026-05-16 See also: MOC-Web3-Security-Mastery · Tuan-15-Audit-Methodology-Tooling · Tuan-Bonus-Fuzzing-Invariant-Advanced · Tuan-Bonus-ZK-Circuit-Security · invariant-test-templates · References

lthieu's notes

Explorer

Tuan-Bonus-Formal-Verification-Deep

Bonus — Formal Verification for Smart Contracts (Deep Dive)

1. Context & Why

1.1 The auditor’s question this chapter answers

1.2 What you’ll be able to do at the end

1.3 Primary references

2. What Formal Verification Is (and Isn’t)

2.1 The mathematical claim

2.2 What FV is not

2.3 The “what does this prove?” table

3. When FV Wins Over Fuzzing

3.1 The four regimes where FV wins clearly

3.1.1 Tight invariants that must hold over all paths

3.1.2 Counterexamples are statistically sparse

3.1.3 High-value invariants where “almost always holds” is worthless

3.1.4 Regression-safe re-engagement

3.2 The fuzzing-still-wins regimes (don’t oversell FV)

3.3 The pipeline (the right way to use both)

4. Certora Deep — CVL, Rules, Hooks, Ghosts

4.1 The toolchain shape

4.2 CVL anatomy — the methods block

4.3 Writing a rule — parametric vs concrete

4.4 Invariants — inductive properties

4.5 Ghost variables and hooks — shadow state

4.6 Common rule patterns — the auditor’s reusable library

4.6.1 Monotonicity

4.6.2 No-magic-mint / no-unauthorised-state-change

4.6.3 Conservation of value

4.6.4 Role transitions

4.6.5 Function-pair inverse

4.7 Sanity checks (built-in)

4.8 CVL — limitations and gotchas

4.9 Reading a counterexample

4.10 Real audits using Certora — recent track record

5. Halmos Deep — BMC Against Foundry Tests

5.1 Installation and first run

5.2 The check_* pattern

5.3 Symbolic primitives — svm cheatcodes

5.4 Loop unrolling and bounds

5.5 Halmos vs Certora — what to use when

5.6 Halmos limitations

6. K Framework, KEVM, Kontrol

6.1 What K gives you that Certora doesn’t

6.2 Kontrol — KEVM meets Foundry

6.3 K framework workflow (for the curious)

7. hevm — The Open-Source Symbolic Engine

7.1 What hevm offers

7.2 Where hevm fits

7.3 Practical caveat

8. Mythril & Manticore — Historical Context

9. The Auditor’s Four-Phase Workflow

9.1 Phase 1 — Identify invariants

9.2 Phase 2 — Spec in the right tool

9.3 Phase 3 — Tighten the spec until silence

9.4 Phase 4 — Maintain across refactors

10. Lab

10.1 Setup

10.2 The target — a small lending contract

10.3 Lab 1 — Halmos check_*: balance ≤ totalSupply

10.4 Lab 2 — Certora CVL: no unauthorised mint

10.5 Lab 3 — The bug Echidna missed in N runs

10.6 Lab — Stretch

11. Anti-patterns (FV-specific, add to the master checklist)

12. Trade-offs

13. Quiz (≥80% to claim FV literacy)

14. Deliverables

15. Where this leads

Graph View

Table of Contents

Backlinks

5.2 The `check_*` pattern

5.3 Symbolic primitives — `svm` cheatcodes

10.3 Lab 1 — Halmos `check_*`: balance ≤ totalSupply