Bonus — Formal Verification for Smart Contracts (Deep Dive)
“Fuzzing finds the bugs you can stumble onto. Formal verification proves the bugs you cannot. When the invariant must hold for the trillion-dollar accounting line, ‘we ran ten million random inputs and nothing broke’ is not an answer — it is a literal probability statement, and a sophisticated attacker is not a random input. The auditor’s job, at the top of the craft, is to translate the protocol’s promise into a mathematical statement and force a solver to either prove it or hand you the witness.”
Tags: web3-security formal-verification certora halmos cvl symbolic invariant methodology tool Learner: Comfortable with Tuan-05-Vulnerability-Classes-Part-1 and the invariant material in Tuan-15-Audit-Methodology-Tooling; ready to graduate from “I ran Echidna” to “I proved the property” Time: 5–8 days self-paced (treat as deep bonus; a single audit using these tools well can save a client eight figures) Related: Tuan-15-Audit-Methodology-Tooling · Tuan-Bonus-Fuzzing-Invariant-Advanced · Tuan-Bonus-ZK-Circuit-Security · invariant-test-templates · Tuan-04-Security-Foundations-CEI-AC
1. Context & Why
1.1 The auditor’s question this chapter answers
You’ve done Week 15. You can write invariants, harness Foundry’s invariant runner, run Echidna and Medusa overnight, and produce a report. You are still missing one tool: proof.
Fuzzing demonstrates a bug, but it never demonstrates absence of bugs. After 10M Echidna runs without a counterexample you have evidence that the bug is rare among uniformly-distributed inputs — but an attacker is not uniformly distributed, and a 1-in-2^96 corner case (a specific oracle price, a specific reserve ratio, a specific calldata shape) will never surface in Echidna’s lifetime. The bugs that drained Curve, Cream, Euler, Mango Markets all had counterexamples that fuzzing would have hit only if it had been told exactly where to look.
Formal verification (FV) is the discipline of writing a mathematical statement of what the contract must do and then asking a solver — Z3, CVC5, Yices — to either prove the statement holds across all possible executions or return a concrete counterexample. There is no “we tested it enough”; there is only “proven” or “here is the bug, with exact input values”.
This chapter is a deep dive on the toolchain — Certora, Halmos, hevm, K/KEVM — and, more importantly, on the auditor’s workflow around them: when to reach for which tool, how to write specs that actually prove things, how to recognize when the spec is wrong (the most common failure mode, by a wide margin), and how to integrate FV into a normal audit timeline.
1.2 What you’ll be able to do at the end
- Explain the difference between bounded model checking, symbolic execution, and theorem proving — in one sentence each, with the tool you’d use for each.
- Recognize the three classes of property where FV is the right tool and the three where it’s the wrong tool.
- Write a basic Certora CVL spec for an ERC-4626 vault: methods block, invariant, parametric rule, ghost variable with hook.
- Write a Halmos
check_*test that proves an access-control matrix. - Read a Certora counterexample and trace it back to the violating line of Solidity.
- Articulate the cost/benefit case to a client deciding whether to add FV to an engagement.
1.3 Primary references
| Source | URL | Notes |
|---|---|---|
| Certora Prover docs (CVL) | https://docs.certora.com/ | Canonical CVL reference. CVL2 is the current language version. [verify version on docs site] |
| Certora Examples repo | https://github.com/Certora/Examples | Working specs for ERC20, ERC4626, Aave, Compound, MakerDAO patterns. Start here for templates. |
| Halmos | https://github.com/a16z/halmos | a16z’s BMC tool. v0.3.x as of 2025; check halmos --version. [verify] |
| Halmos cheatcodes (svm) | https://github.com/a16z/halmos-cheatcodes | Symbolic primitives — svm.createUint(...), svm.createBytes(...). |
| hevm | https://github.com/ethereum/hevm | Symbolic engine, Foundry-compatible, Haskell. v0.57.x as of early 2026. [verify] |
| KEVM (Runtime Verification) | https://github.com/runtimeverification/evm-semantics | The EVM, in K. Foundation for Kontrol. |
| Kontrol (RV Foundry plugin) | https://github.com/runtimeverification/kontrol | KEVM-backed Foundry verification (commercial). |
| K framework | https://kframework.org/ | Operational semantics framework. |
| Trail of Bits — Manticore (historical) | https://github.com/trailofbits/manticore | Older symbolic tool; mostly superseded by Halmos/hevm for EVM. |
| ConsenSys — Mythril (historical) | https://github.com/Consensys/mythril | Older symbolic executor. Still occasionally useful for bytecode-only analysis. |
| a16z blog — Halmos intro | https://a16zcrypto.com/ (search “halmos”) | Motivation and walkthrough. |
| Certora blog | https://www.certora.com/blog | Real-bug write-ups: Silo, Texture, infiniFi, Kamino, Aave V4, Lido dual-governance, Uniswap V4. |
| Trail of Bits — “Building Secure Contracts” | https://github.com/crytic/building-secure-contracts | Has a “Properties” section with manual invariant patterns. |
| Underhanded Solidity Contest writeups | https://github.com/Underhanded-Solidity-Contest | Curated bugs that look benign and are hard to formalize — best test bed for “can my spec catch this?“. |
2. What Formal Verification Is (and Isn’t)
2.1 The mathematical claim
Formal verification proves a statement of the form:
For all inputs
xsatisfying preconditionP(x), the system’s behaviorS(x)satisfies postconditionQ(x, S(x)).
Three things to internalise:
- “For all” is literal. Not “for the 10M inputs Echidna sampled”. Not “for the inputs the human auditor thought of”. For every input in the symbolic input space — typically
uint256, the full2^256set, orbytes calldataof every length up to a bound. - The system is the bytecode, not the source. The prover compiles your Solidity to EVM bytecode (or works on a bytecode trace) and reasons about the actual semantics. This matters: Vyper compiler bugs, Solidity optimizer artifacts, ABI-decoding quirks — all visible.
- Either the prover returns “proved” or it returns a counterexample. A counterexample is the auditor’s gold: a specific assignment of inputs that violates the postcondition. Translate it back to a PoC, file a finding.
The three sub-disciplines:
| Discipline | What it does | Best tool for EVM | Soundness scope |
|---|---|---|---|
| Bounded Model Checking (BMC) | Unrolls execution to depth N; checks property over all paths of length ≤ N | Halmos, hevm | Sound up to bound N. Anything beyond N is unchecked. |
| Symbolic execution | Explores all symbolic paths; uses SMT to decide path conditions | Halmos, hevm, Manticore (legacy) | Sound when termination is reached; otherwise depends on bounds. |
| Inductive theorem proving | Proves P holds in initial state and is preserved by every transition | Certora (auto-induction via SMT), K/KEVM (interactive) | Sound for unbounded executions when induction succeeds. |
2.2 What FV is not
A surprising fraction of audit clients (and not-so-junior auditors) confuse FV with adjacent practices. Be precise.
- FV is not bug-finding by exhaustive testing. That’s fuzzing. Fuzzing samples; FV proves over all of input space.
- FV is not “running a verifier”. It’s writing specifications. The verifier is the easy part. The spec — the precise statement of what correct means — is where 80% of the value and 100% of the difficulty lives.
- FV is not type-checking on steroids. The Solidity compiler proves syntactic things; FV proves semantic properties of a specific contract.
- FV is not Slither + magic. Slither is pattern matching on a syntactic representation. It’s pure heuristics. FV reasons about state and inputs.
- FV does not prove your protocol is secure. It proves the properties you wrote down hold. If the property is wrong, weak, or missing, FV says nothing about that gap. The wrong-spec failure mode is the dominant one in practice.
2.3 The “what does this prove?” table
When reading a Certora or Halmos engagement report, the question that separates the senior from the junior is: “what does this prove, and what does this not prove?”
| Claim from a spec | What it proves | What it does NOT prove |
|---|---|---|
assert balanceOf(user) <= totalSupply() after every function | Balances stay ≤ supply across all single-tx executions | Multi-tx atomic-bundle attacks (flash-loan composition), externally-imposed states |
invariant totalSupply == sum(balances) | The classic accounting equation holds in all reachable states | That total supply itself is correct (mints could be unauthorized but still preserve sum) |
rule onlyOwnerCanPause — pause() reverts unless caller == owner | The auth check is enforced for pause() | That owner is correctly set, that owner can’t be hijacked via other paths |
rule deposit increases shares | deposit() strictly increases the receiver’s share balance | That the amount of shares is correctly priced |
Auditor’s reflex: every CVL rule should be paired with a one-line “and this would still allow the following exploit” note. If you can’t think of any, your spec is probably too tight (vacuously true) or too loose (you missed a property).
3. When FV Wins Over Fuzzing
Fuzzing and FV are not substitutes; they are complements. They cover different parts of the property space.
3.1 The four regimes where FV wins clearly
3.1.1 Tight invariants that must hold over all paths
“The vault is always solvent:
totalAssets() >= totalLiabilities().”
Fuzzing can show this fails in some trace. It cannot show it never fails. If the property is tight — the bound is sharp, and there are many edge inputs that approach it — fuzzing will hit the bound but rarely cross it. FV proves the bound holds or finds the exact path that crosses it.
Example — an ERC-4626 vault’s convertToAssets(shares) must round down in favor of the vault. The bug class is a single division that rounds up in a corner. Fuzzing samples one rounding direction; FV proves it for all share/asset ratios.
3.1.2 Counterexamples are statistically sparse
If the bug requires a specific combination — e.g., (reserveA, reserveB, swapIn) such that an intermediate computation underflows by exactly 1 wei — random fuzzing has a probability 1/2^k of hitting it, where k is the number of constrained bits. For k = 96, you’d need 2^96 ≈ 10^29 runs to hit it once on average. Echidna runs perhaps 10^7 cases per hour. Fuzzing will never see this bug.
Examples:
- Kyber Elastic (2023): swap math had a corner involving tick-crossing with a specific liquidity delta. Probability under random fuzz: vanishing.
- Curve
virtual_priceinterim-state read: requires entering during a specific re-entrant moment. - Compound COMP distribution (2021): bug triggered by specific index increment overflow.
FV reasons about symbolic values, so the corner is one path among finitely many — not a needle in a haystack.
3.1.3 High-value invariants where “almost always holds” is worthless
Solvency. Accounting conservation. Access control. These either hold or the protocol is dead. “Holds with 99.9999% probability under fuzzing” is meaningless when the protocol holds 300M.
For these, the FV cost is paid back the first time an auditor surfaces a Certora reports VIOLATED and ships a counterexample. That has happened repeatedly at Aave, Maker, Lido, Compound. [verify with current Certora blog write-ups]
3.1.4 Regression-safe re-engagement
A protocol fixes a bug. The fix is one line. How do you know the fix didn’t create a new bug, or fail to address a sibling bug?
- Unit test: only checks the specific input that was buggy. Sibling cases unverified.
- Fuzz: random sampling around the fix. Maybe hits, maybe not.
- FV: re-run the spec. If “no unauthorized mint” was proven before and after the fix, you have a mathematical guarantee that the fix preserves the invariant.
This is why Certora is heavily used for upgrades: the spec is a contract between the team and the verifier. Each new version must keep passing.
3.2 The fuzzing-still-wins regimes (don’t oversell FV)
| Regime | Why fuzzing wins |
|---|---|
| Exploratory unknown-unknowns | You don’t know what the invariant is yet. Fuzzing surfaces unexpected reverts and divergences that suggest new invariants to write. |
| Loop-heavy iterative code | FV must unroll loops. Bound = N means anything ≥ N is unverified. Fuzzing happily runs the loop. |
| External-call-heavy contracts | FV must abstract or summarise external code (e.g., DISPATCHER); abstractions weaken the proof. Fuzzing forks mainnet and uses the real callee. |
| Time / block-dependent logic | FV reasoning about block.timestamp across many txs requires careful environment modeling; fuzzing just bumps time. |
| Throughput | A modest fuzz harness does 10^6 runs/min; Certora may take 30 min to prove one rule. For an iterating dev loop, fuzz is faster. |
3.3 The pipeline (the right way to use both)
flowchart LR A[Invariant list<br>from §6 of Tuan-15] --> B[Unit test<br>concrete cases] B --> C[Foundry invariant<br>handler-based fuzzing] C --> D[Echidna / Medusa<br>overnight fuzz runs] D --> E[Halmos check_*<br>BMC on hotspots] E --> F[Certora CVL<br>protocol-wide invariants] F --> G[Regression<br>re-run on every PR] style A fill:#e6f3ff style F fill:#c2f0c2 style G fill:#c2f0c2
Cheap → expensive, left → right. Don’t skip ahead. Fuzz finds counterexamples in seconds; surface them, fix them, then try to formalize.
4. Certora Deep — CVL, Rules, Hooks, Ghosts
Certora is the commercial standard. Its sweet spot is inductive invariants over unbounded executions — properties that must hold across any sequence of any functions in any order. That coverage is impossible with BMC tools like Halmos.
4.1 The toolchain shape
contract.sol ────► Solidity compiler ────► Bytecode + ABI
spec.spec ────► certoraRun CLI ────► upload to cloud
│
▼
SMT solver (Z3/CVC5)
│
┌──────────────────────┼──────────────────────┐
▼ ▼ ▼
PROVED rule VIOLATED rule (CEX) TIMEOUT
call trace + storage tune spec or
state + msg.sender increase budget
The Certora cloud runs the actual prover. The local CLI uploads bytecode + spec and downloads results. Free tier exists with limited monthly compute; commercial tiers scale.
4.2 CVL anatomy — the methods block
Every CVL spec starts by declaring which Solidity functions it can reason about and how to model each.
methods {
// Pure-ish view function with no environment dependency
function totalSupply() external returns (uint256) envfree;
function totalAssets() external returns (uint256) envfree;
function balanceOf(address) external returns (uint256) envfree;
// State-changing functions: need an env (block context)
function deposit(uint256, address) external returns (uint256);
function redeem(uint256, address, address) external returns (uint256);
// External-contract calls: summarise behaviour
function _.transferFrom(address, address, uint256) external => DISPATCHER(true);
function _.balanceOf(address) external => DISPATCHER(true);
// For un-summarisable external code, abstract as nondeterministic
function someOracle.price() external returns (uint256) => NONDET;
}envfree is a load-bearing keyword. It says: “this function’s return value does not depend on msg.sender, block.timestamp, etc.” If the prover proves this, calls to the function in rules don’t need an env argument — your specs get much cleaner. If it can’t prove envfree, your declaration fails and you must pass env everywhere.
Summaries for external functions:
| Summary | Semantics | When to use |
|---|---|---|
DISPATCHER(true) | When the contract’s address matches a known contract, dispatch to its code; otherwise nondeterminstic | Most ERC-20 calls — the test harness provides concrete implementations |
DISPATCHER(false) | Same, but if no match, havoc state | Default for cautious modelling |
NONDET | The function returns an arbitrary value; no state changes outside this contract | When external is fully opaque (oracle, off-chain) |
ALWAYS(<expr>) | The function returns the constant expr | When you know the return is fixed for the proof |
AUTO | Default — Certora picks based on context | Use only when you’ve checked it picked sensibly |
Function summary => myCVLFunction(args) | Replace with CVL-defined helper | Heavy customisation |
4.3 Writing a rule — parametric vs concrete
Concrete rule — one named function, one property:
rule depositMintsShares(uint256 assets, address receiver) {
env e;
// Preconditions
require receiver != 0;
require receiver != currentContract;
require assets > 0;
uint256 sharesBefore = balanceOf(receiver);
uint256 mintedShares = deposit(e, assets, receiver);
uint256 sharesAfter = balanceOf(receiver);
assert sharesAfter == sharesBefore + mintedShares,
"receiver's share balance must increase by minted amount";
assert mintedShares > 0,
"non-zero deposit must mint non-zero shares";
}Parametric rule — quantified over all methods:
// Prove that NO function ever increases totalSupply except the authorised mint paths
rule onlyAuthorisedMint(method f) filtered {
f -> !f.isView && f.selector != sig:mint(address,uint256).selector
&& f.selector != sig:deposit(uint256,address).selector
} {
env e;
calldataarg args;
uint256 supplyBefore = totalSupply();
f(e, args);
uint256 supplyAfter = totalSupply();
assert supplyAfter <= supplyBefore,
"non-mint function increased totalSupply — unauthorised mint?";
}The method f plus the filter block tell Certora to re-prove this rule once per function in the contract (except the two explicitly authorised ones). One spec, full coverage. This is the killer feature: you specify the property once, and the prover sweeps every entry point. Most reentrancy and access-control bugs surface this way: some obscure function turns out to mutate state that “shouldn’t” be mutated, and the parametric rule catches it.
4.4 Invariants — inductive properties
A CVL invariant is a predicate that must hold in every reachable state. Certora proves it by induction:
- Base case: the predicate holds in the initial state (after constructor).
- Inductive step: assuming it holds before a function call, prove it still holds after.
invariant totalSupplyEqualsSumOfBalances()
totalSupply() == sumOfBalances();where sumOfBalances is a ghost (next section). Certora attempts to discharge the inductive step automatically for every function. If a function violates it, you get a counterexample — that function is the bug, even if it’s not the function the developer “thought” was about supply.
Preserved blocks — when the prover can’t prove the inductive step alone, you add hints:
invariant totalSupplyEqualsSumOfBalances()
totalSupply() == sumOfBalances()
{
preserved transfer(address to, uint256 amount) with (env e) {
require e.msg.sender != to; // self-transfer corner ruled out
require balanceOf(e.msg.sender) >= amount;
}
preserved {
require totalSupply() < 2^128; // global bound to avoid SMT timeout
}
}preserved clauses add additional preconditions to the inductive step per method (or unrestricted). They are the auditor’s escape hatch when the prover times out — you trim the input space to what’s actually relevant.
Caveat: every require in a preserved block is a trust assumption. The invariant is “proved given these assumptions”. A senior auditor flags any preserved-block requirement that looks suspicious — it can be hiding the bug.
4.5 Ghost variables and hooks — shadow state
Ghosts are CVL-only variables maintained alongside contract storage. They model abstract quantities the contract doesn’t directly track.
// Ghost: total amount ever deposited (across all users, all time)
ghost mathint totalDepositedEver {
init_state axiom totalDepositedEver == 0;
}
// Hook: every time a user's deposit balance increases, bump the ghost
hook Sstore deposits[KEY address user] uint256 newVal (uint256 oldVal) {
totalDepositedEver = totalDepositedEver + newVal - oldVal;
}
// Rule using the ghost as an accounting check
invariant solvency()
to_mathint(totalAssets()) >= totalDepositedEver - totalWithdrawnEver;mathint = unbounded mathematical integer (no overflow). Use this for ghosts; only convert to/from uint256 when interacting with the contract.
Hooks instrument the EVM-level operations:
Sstore— fires on every storage write to the named slot/key.Sload— fires on every read.CALL,STATICCALL,DELEGATECALL— fires on external calls.
The hook runs inside the prover’s symbolic simulation, not at run time. There’s no gas cost. Ghosts are pure spec.
Pattern: sum-of-mapping ghost (the workhorse for ERC-20-style invariants):
ghost mathint sumOfBalances {
init_state axiom sumOfBalances == 0;
}
hook Sstore _balances[KEY address user] uint256 newBal (uint256 oldBal) {
sumOfBalances = sumOfBalances + newBal - oldBal;
}
// Now the totalSupply invariant is:
invariant totalSupplyEqualsSumOfBalances()
to_mathint(totalSupply()) == sumOfBalances;Once you have this ghost, you can prove conservation across every function in 5 lines of CVL.
4.6 Common rule patterns — the auditor’s reusable library
4.6.1 Monotonicity
“Once paused, the contract never unpauses without
onlyAdminre-entering.” “User’s nonce is monotonically increasing.”
rule nonceMonotone(address user, method f) filtered { f -> !f.isView } {
env e;
calldataarg args;
uint256 nonceBefore = nonces(user);
f(e, args);
uint256 nonceAfter = nonces(user);
assert nonceAfter >= nonceBefore;
}4.6.2 No-magic-mint / no-unauthorised-state-change
“totalSupply never increases except via
mintcalled byMINTER_ROLE.”
rule noUnauthorisedMint(method f) filtered { f -> !f.isView } {
env e;
calldataarg args;
uint256 supplyBefore = totalSupply();
f(e, args);
uint256 supplyAfter = totalSupply();
assert supplyAfter > supplyBefore =>
(f.selector == sig:mint(address,uint256).selector &&
hasRole(MINTER_ROLE(), e.msg.sender));
}4.6.3 Conservation of value
“After any swap,
tokenA.reserve * tokenB.reserve >= k_before(constant-product invariant, with fee adjustment).”
This is the AMM solvency rule. Direct CVL translation, modulo fee-math precision.
4.6.4 Role transitions
“Only the current admin can grant
DEFAULT_ADMIN_ROLEto a new account.”
rule onlyAdminGrantsAdmin(address account, method f, env e) {
bool wasAdmin = hasRole(DEFAULT_ADMIN_ROLE(), account);
calldataarg args;
f(e, args);
bool isAdminNow = hasRole(DEFAULT_ADMIN_ROLE(), account);
assert (!wasAdmin && isAdminNow) =>
hasRole(DEFAULT_ADMIN_ROLE(), e.msg.sender);
}4.6.5 Function-pair inverse
“
depositfollowed byredeemof the resulting shares returns the user to their starting position (within rounding).”
rule depositRedeemInverse(uint256 assets, address user) {
env e;
require user != currentContract && user != 0;
require assets > 0;
uint256 startBal = underlying.balanceOf(user);
uint256 shares = deposit(e, assets, user);
uint256 redeemed = redeem(e, shares, user, user);
uint256 endBal = underlying.balanceOf(user);
// Allow 1-wei rounding tolerance (always in favour of vault)
assert endBal <= startBal;
assert endBal + 1 >= startBal - 1;
}This is the rule that catches inflation attacks, donation bugs (Euler 2023’s class), and rounding-direction errors.
4.7 Sanity checks (built-in)
Certora ships built-in sanity rules to catch the most embarrassing class of false-positive: vacuous specs. A vacuous rule is one whose preconditions are unsatisfiable — Certora “proves” it because no input even reaches the assertion.
// VACUOUS — preconditions contradict each other
rule oops(uint256 x) {
require x > 100;
require x < 50;
assert someFunction(x) == 0; // proved trivially
}Run with --rule_sanity basic (or advanced) to surface these. Every Certora engagement should turn sanity on. A “passed” rule that’s vacuous is worse than a failing rule — it gives false security.
4.8 CVL — limitations and gotchas
| Limitation | Mitigation |
|---|---|
| SMT timeout on complex specs | Tighten preserved blocks; bound mathint; split rule into smaller rules; use --smt_timeout with care |
| Unrolling loops | Set --loop_iter N for BMC of bounded loops; for unbounded loops, prove an invariant about the loop body instead |
| External calls to unknown code | Use DISPATCHER(true) with concrete mocks; or NONDET and accept the abstraction |
block.timestamp reasoning across many txs | Add explicit env constraints on time progression |
| Storage layout assumptions | Hooks reference slots by Solidity name (balances[KEY ...]), but be aware that proxy upgrades may shift layout — re-run on each version |
| Floating-point / fixed-point math | Certora prefers integer math; mathint covers most of it. Fixed-point bugs sometimes need careful normalisation in the spec. |
| Cost | Free tier exists with monthly compute quota; commercial pricing per protocol. Plan for 1-2 weeks of CVL ramp-up before productivity. |
4.9 Reading a counterexample
A Certora “VIOLATED” report comes with:
- Call trace — the exact sequence of function calls.
- Storage state — values of every read storage slot at each step.
msg.sender/msg.value/block.timestamp— the env at each call.- The assertion that failed — your own assert.
The auditor’s job: translate this to a concrete PoC. Usually forge test with the exact inputs Certora chose, ideally reproduced in a test that fails before the fix and passes after.
Failure mode: the counterexample is in a corner that the developer dismisses as “unreachable in practice”. Sometimes they’re right (the corner requires a state that, by construction, never occurs because of an external invariant). Sometimes they’re wrong, and the auditor’s job is to demonstrate, with a mainnet-fork PoC, that the state is reachable. Never accept “this can’t happen” without a written argument that itself becomes a trust assumption in the audit report.
4.10 Real audits using Certora — recent track record
[verify all of these with current Certora blog]
- Aave V4 — multi-month engagement, deep AMM and lending invariants.
- Compound — CTKO accounting, distribution math.
- MakerDAO / Dai — global solvency invariants; PSM-specific rules.
- Lido — dual-governance system; staking-router invariants. Found governance bugs pre-mainnet.
- Uniswap V4 — hook composability; found vulnerabilities involving malicious hooks that fuzzing missed.
- Morpho — lending vault invariants.
- Silo / Texture / infiniFi / Kamino — multiple bugs caught pre-deployment per Certora blog write-ups.
Pattern: the higher the TVL and the more complex the accounting, the more lopsided the cost/benefit toward FV.
5. Halmos Deep — BMC Against Foundry Tests
Halmos (from a16z) is the open-source counterpart to Certora for a large class of single-tx properties. It reads Foundry-style tests, replaces concrete inputs with symbolic ones, and uses an SMT solver to explore all paths.
5.1 Installation and first run
# Recommended: uv (Astral) for Python tooling
curl -LsSf https://astral.sh/uv/install.sh | sh
uv tool install --python 3.12 halmos
# Verify
halmos --version
# halmos 0.3.x [verify]# In a Foundry project
cd my-foundry-project
forge build
halmosBy convention, tests Halmos runs are prefixed check_* (not test_*). This separates pure Foundry tests from symbolic ones; both can co-exist.
5.2 The check_* pattern
// test/halmos/CheckVault.t.sol
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;
import "forge-std/Test.sol";
import "halmos-cheatcodes/SymTest.sol";
import "../src/Vault.sol";
contract CheckVault is Test, SymTest {
Vault vault;
function setUp() public {
vault = new Vault();
}
/// @dev Property: balanceOf(user) is always ≤ totalSupply()
function check_balanceWithinSupply(address user) public {
// Symbolic precondition: avoid trivial reverts
vm.assume(user != address(0));
// The property
assert(vault.balanceOf(user) <= vault.totalSupply());
}
/// @dev Property: deposit followed by withdraw is value-preserving
function check_depositWithdrawRoundtrip(uint256 amount, address user) public {
vm.assume(amount > 0 && amount < 2**128);
vm.assume(user != address(0) && user != address(vault));
uint256 before = vault.balanceOf(user);
vm.prank(user);
vault.deposit(amount);
vm.prank(user);
vault.withdraw(amount);
uint256 afterBal = vault.balanceOf(user);
assert(afterBal == before);
}
}Run:
halmos --function check_balanceWithinSupply --loop 4
halmos --function check_ # all check_* functions5.3 Symbolic primitives — svm cheatcodes
Halmos provides halmos-cheatcodes/SymTest.sol with helpers:
import "halmos-cheatcodes/SymTest.sol";
contract CheckSym is Test, SymTest {
function check_thing() public {
uint256 x = svm.createUint256("x"); // fully symbolic uint256
bytes memory data = svm.createBytes(64, "data"); // 64-byte symbolic bytes
address addr = svm.createAddress("addr"); // symbolic address
bool flag = svm.createBool("flag");
// Use as ordinary values; Halmos tracks them symbolically
if (flag && x > 100) {
target.foo{value: 0}(addr, data, x);
}
assert(/* something */);
}
}svm.createXXX are the load-bearing primitives. Function parameters are also symbolic by default — for many properties, that’s enough.
5.4 Loop unrolling and bounds
Halmos is bounded — it doesn’t reason about unbounded loops automatically. Configure:
halmos --loop 8 # unroll loops up to depth 8
halmos --array-lengths 0,1,2,32 # try arrays of these lengths
halmos --solver-timeout-assertion 60000 # 60s per assertionIf your code has a loop with N iterations and you set --loop 4, Halmos verifies paths with ≤ 4 iterations and silently skips longer ones. This is the most common source of false-pass with Halmos. Read the output: it lists paths cut off by the bound.
5.5 Halmos vs Certora — what to use when
| Property type | Halmos | Certora |
|---|---|---|
| Single-tx pre/post over symbolic inputs | Excellent (free, fast iteration) | Overkill |
| Pure math library (TickMath, FullMath, ABDK) | Excellent | Excellent |
| Access control matrix | Excellent | Excellent |
| Multi-tx invariants (state machine) | Limited (need to manually compose) | Native (invariants are first-class) |
| Inductive invariants over unbounded executions | No | Yes |
| Reasoning about ghost / shadow state | No | Yes (ghosts + hooks) |
| Loop verification beyond a depth | No (bounded only) | Limited (needs loop invariants) |
| Cost | Free | Commercial / limited free tier |
Rule of thumb: Halmos for tight, single-function properties you can express as assert(<property>) after a symbolic call. Certora for the global accounting properties that hold across any sequence of any function calls.
5.6 Halmos limitations
| Limitation | Mitigation |
|---|---|
| Loop bound | Set explicit --loop N; never trust the default for production specs |
| External calls to unknown code | Mock the callee with a concrete contract; or use vm.mockCall |
| Storage state explosion | Reduce state via setUp — start from a constrained state |
block.* symbolic | Constrain vm.warp / vm.roll to specific values where appropriate |
| Slow solver on rich symbolic arrays | Use --array-lengths 0,1,2 to cover the relevant cases |
6. K Framework, KEVM, Kontrol
The K framework is a meta-tool for defining the operational semantics of any programming language. Runtime Verification (RV) has used K to define the EVM itself (KEVM) and to verify properties of contracts against the full EVM semantics — every opcode, every gas cost, every edge case.
6.1 What K gives you that Certora doesn’t
Certora abstracts the EVM via Solidity-level semantics. That works for 99% of contracts but fails for:
- Compiler-bug-resistant proofs — proving a property against the actual bytecode independent of any compiler.
- EVM specification proofs — proving properties of EVM precompiles, EIPs, opcodes.
- Cross-language verification — Vyper, Huff, Yul all compile to EVM; K reasons at the EVM layer.
KEVM has been used to formally verify:
- Casper FFG (Ethereum PoS finality) — pre-merge spec verification.
- ERC-20 and ERC-721 reference implementations — KEVM-verified token semantics.
- EIP-7212 (P-256 precompile) — verified semantics. [verify]
- Beacon Chain components — selected pieces. [verify]
- Uniswap V1 math — early proof, historical interest.
- DEX, MakerDAO components — RV engagement work.
6.2 Kontrol — KEVM meets Foundry
For application auditors, raw K is overkill. RV’s Kontrol wraps KEVM in a Foundry-compatible interface, similar to Halmos but with K’s deeper-semantic backend.
# Sketch — verify by reading current Kontrol docs
kontrol build
kontrol prove --match-test test_invariantKontrol’s advantage: inductive proofs over unbounded loops (Halmos can’t), with a more rigorous EVM model than Certora’s. Disadvantage: steeper learning curve and commercial pricing.
When to consider Kontrol over Certora:
- The protocol relies on subtle EVM semantics (gas accounting in payment systems, precise revert behaviour, precompile-driven code).
- You need a proof against bytecode independent of the source language.
- You’re verifying an EIP or a protocol-layer change.
For routine application audits, Certora + Halmos covers >95% of practical needs. KEVM/Kontrol is the specialist tool.
6.3 K framework workflow (for the curious)
K specs are rewrite rules over configurations (state). The auditor writes:
rule <k> X +Int Y => Z </k>
requires Z ==Int X +Int Y
This is closer to interactive theorem proving than push-button verification. Plan months, not weeks, to become productive.
7. hevm — The Open-Source Symbolic Engine
hevm (originally part of dapptools, now maintained by the Ethereum Foundation’s formal-methods team) is a Haskell implementation of the EVM optimized for symbolic execution and equivalence checking.
7.1 What hevm offers
- Symbolic execution of arbitrary EVM bytecode — no source needed.
- Foundry-compatible test runner —
hevm testruns Forge tests symbolically. - Equivalence checking — prove two bytecodes have identical behaviour for all inputs. Useful for verifying compiler optimisations, refactors, gas-optimised rewrites.
7.2 Where hevm fits
- Bytecode-only contracts — when source is unavailable (rare in audits, but happens on forks).
- Compiler-output verification — confirm a Yul rewrite is equivalent to original Solidity.
- As a backend — Halmos and other tools historically borrowed pieces from hevm/dapptools heritage.
7.3 Practical caveat
hevm is excellent technology with smaller community adoption than Halmos or Certora. Use it when:
- You need equivalence checking (no other tool offers this for EVM out-of-the-box).
- You’re verifying bytecode without source.
- You’re already in the Haskell ecosystem.
Otherwise, Halmos is the smoother on-ramp for symbolic testing.
8. Mythril & Manticore — Historical Context
The 2018-era symbolic execution wave produced:
- Mythril (ConsenSys) — bytecode-level symbolic analysis with built-in detectors for common bug classes. Still occasionally used for one-off bytecode review.
- Manticore (Trail of Bits) — generic symbolic execution engine, EVM module. Largely superseded by Halmos for EVM use cases.
Auditor’s view: don’t reach for these as a first choice in 2026. Halmos has the better UX, hevm has the better foundation, Certora has the better proof power. Mythril and Manticore are part of the lineage; cite them in the methodology section of an audit report only when there’s a specific reason.
9. The Auditor’s Four-Phase Workflow
The point of this entire chapter is not “learn three tools” but “integrate FV into a normal audit”. The workflow:
9.1 Phase 1 — Identify invariants
This is the work from Week 15 §6. FV cannot tell you what to prove; it can only check what you claim.
Sources, in priority order:
- Whitepaper / protocol design doc — explicit invariants (“the system is always solvent”).
- NatSpec — function-level invariants the developer documented.
require/revertstatements — implicit invariants the developer enforced.- Test suite — assertions are invariants the developer expected.
- Your own threat model — “if X stopped being true, who profits?”
Produce a written list:
# Invariants — <Project>
## Global
- G1: totalSupply() == Σ balances[i] for all i
- G2: totalAssets() ≥ totalLiabilities()
- G3: Only DEFAULT_ADMIN_ROLE can change DEFAULT_ADMIN_ROLE
## Per-module: Vault
- V1: shares minted on deposit ≥ shares burned on equivalent redeem (rounding favours vault)
- V2: convertToAssets is monotonic non-decreasing in shares
- V3: redeem reverts when paused
...
Each line is a candidate for a CVL rule, a Halmos check_*, or both.
9.2 Phase 2 — Spec in the right tool
Decision table:
| Property shape | Best tool |
|---|---|
”Function f returns X iff Y, for any input” | Halmos check_* |
| ”No function can move state from A to B except via path Z” | Certora parametric rule |
| ”Across any sequence of function calls, P holds” | Certora invariant |
| ”Pure math library never overflows for input in range” | Halmos (cheap) or Certora |
| ”Storage layout consistent across upgrade V1 → V2” | OZ upgrades-plugin static check (cheaper than FV) |
| “Solvency under flash-loan attack” | Foundry fuzz first, then Certora rule with multi-tx env |
9.3 Phase 3 — Tighten the spec until silence
Workflow:
- Write the rule. Run the prover.
- Counterexample: did you find a bug, or is your spec too loose? Inspect the CEX:
- Real bug: file the finding, fix the code, re-run.
- Spec too loose: tighten preconditions (a
requirein CVL orvm.assumein Halmos) and re-run. - Spec too tight (vacuous): a sanity check should catch this; relax.
- Timeout: the spec is true but the solver can’t prove it in time. Decompose: split the property, add intermediate lemmas, bound mathints, or add preserved blocks.
- Proved: rule passes. Move to the next.
The single most important habit: every time the prover says “proved”, ask yourself: would this still pass if I deleted the function body? If yes, the spec is vacuous. Add a sanity check or a complementary “this function actually does something” rule.
9.4 Phase 4 — Maintain across refactors
Specs are part of the code. They live in certora/ or test/halmos/ and are re-run on every PR. The CI pipeline:
# .github/workflows/fv.yml — conceptual
jobs:
halmos:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- run: forge build
- run: halmos --function check_ --loop 8
certora:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- run: pip install certora-cli
- run: certoraRun certora/run.confA PR that breaks a CVL rule should fail CI as loudly as a broken unit test. Specs are how the protocol team and the auditor talk to each other across upgrades.
10. Lab
10.1 Setup
mkdir -p ~/web3-sec-lab/bonus-fv && cd ~/web3-sec-lab/bonus-fv
forge init --no-commit lending-fv
cd lending-fv
# Halmos
curl -LsSf https://astral.sh/uv/install.sh | sh
uv tool install --python 3.12 halmos
halmos --version
# Halmos cheatcodes lib
forge install a16z/halmos-cheatcodes --no-commit
# Certora CLI (optional; free tier requires account)
pip install certora-cli
solc-select install 0.8.24 && solc-select use 0.8.2410.2 The target — a small lending contract
// src/MiniLending.sol
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;
contract MiniLending {
address public admin;
mapping(address => uint256) public balanceOf;
uint256 public totalSupply;
bool public paused;
modifier onlyAdmin() {
require(msg.sender == admin, "not admin");
_;
}
modifier whenNotPaused() {
require(!paused, "paused");
_;
}
constructor() {
admin = msg.sender;
}
function deposit(uint256 amount) external whenNotPaused {
balanceOf[msg.sender] += amount;
totalSupply += amount;
}
function withdraw(uint256 amount) external whenNotPaused {
require(balanceOf[msg.sender] >= amount, "insufficient");
balanceOf[msg.sender] -= amount;
totalSupply -= amount;
}
function mint(address to, uint256 amount) external onlyAdmin {
balanceOf[to] += amount;
totalSupply += amount;
}
function pause() external onlyAdmin {
paused = true;
}
function unpause() external onlyAdmin {
paused = false;
}
// The subtle bug we'll plant in Lab 3 lives here.
function airdrop(address to, uint256 amount) external {
balanceOf[to] += amount; // ← OOPS: totalSupply not updated
}
}The airdrop is deliberately buggy — it increments a balance without authorisation and without updating totalSupply. Echidna with random inputs will rarely care (no obvious assertion fires). Halmos and Certora will both surface it instantly when given the right property.
10.3 Lab 1 — Halmos check_*: balance ≤ totalSupply
// test/halmos/CheckLending.t.sol
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;
import "forge-std/Test.sol";
import "halmos-cheatcodes/SymTest.sol";
import "../../src/MiniLending.sol";
contract CheckLending is Test, SymTest {
MiniLending lending;
function setUp() public {
lending = new MiniLending();
}
/// @notice Property: balanceOf(user) <= totalSupply, after any single
/// user-callable function call.
function check_balanceWithinSupply(
address caller,
address user,
uint256 amount,
uint8 fnSelector
) public {
vm.assume(caller != address(0));
vm.assume(user != address(0));
vm.assume(amount < 2**128); // avoid trivial overflows
// Start in a state where the invariant holds
require(lending.balanceOf(user) <= lending.totalSupply());
// Pick a function to invoke
vm.prank(caller);
if (fnSelector == 0) {
try lending.deposit(amount) {} catch {}
} else if (fnSelector == 1) {
try lending.withdraw(amount) {} catch {}
} else if (fnSelector == 2) {
try lending.airdrop(user, amount) {} catch {}
}
// Invariant must still hold
assert(lending.balanceOf(user) <= lending.totalSupply());
}
}Run:
halmos --function check_balanceWithinSupply --loop 4 -vExpected: Halmos returns Counterexample for the airdrop path:
[FAIL] check_balanceWithinSupply(...)
Counterexample:
caller = 0x...
user = 0x...
amount = 1
fnSelector = 2
Path: airdrop(user, 1) was called; balanceOf[user] became 1; totalSupply stayed 0;
Assertion: balanceOf(user) <= totalSupply() — VIOLATED (1 <= 0 is false)
Why Echidna might miss this: an Echidna campaign without an explicit property_balanceWithinSupply won’t surface this. With the property, it likely does find it — but only because the bug is one operation away. In a real protocol where airdrop is buried behind a setter / role / time check, the search space explodes and fuzzing misses.
Patch: in airdrop, also do totalSupply += amount;, and add an onlyAdmin modifier. Re-run; Halmos reports proved.
10.4 Lab 2 — Certora CVL: no unauthorised mint
This lab is structured so you can either run it against the Certora cloud (free tier) or document the spec without running.
// certora/MiniLending.spec
methods {
function admin() external returns (address) envfree;
function balanceOf(address) external returns (uint256) envfree;
function totalSupply() external returns (uint256) envfree;
function paused() external returns (bool) envfree;
}
// G1: totalSupply equals sum of balances (the workhorse invariant)
ghost mathint sumOfBalances {
init_state axiom sumOfBalances == 0;
}
hook Sstore balanceOf[KEY address user] uint256 newBal (uint256 oldBal) {
sumOfBalances = sumOfBalances + newBal - oldBal;
}
invariant totalSupplyEqualsSumOfBalances()
to_mathint(totalSupply()) == sumOfBalances;
// G2: no function increases totalSupply except mint() called by admin,
// or deposit() called by anyone
rule noUnauthorisedSupplyIncrease(method f) filtered {
f -> !f.isView
} {
env e;
calldataarg args;
uint256 supplyBefore = totalSupply();
f(e, args);
uint256 supplyAfter = totalSupply();
assert supplyAfter > supplyBefore => (
f.selector == sig:deposit(uint256).selector ||
(f.selector == sig:mint(address,uint256).selector && e.msg.sender == admin())
), "supply increased via unauthorised path";
}Run (if you have certora-cli configured):
certoraRun src/MiniLending.sol \
--verify MiniLending:certora/MiniLending.spec \
--solc solc-0.8.24 \
--rule_sanity basic \
--rule noUnauthorisedSupplyIncreaseExpected output:
totalSupplyEqualsSumOfBalances— VIOLATED onairdrop(balances change, supply doesn’t).noUnauthorisedSupplyIncrease— PASSES on the buggy version becauseairdropdoesn’t touchsupplyat all, but fails if we “fix” airdrop naively by also bumpingtotalSupplywithout auth (then any caller can mint).
Both rules together pin down the bug: airdrop either breaks the sum invariant (current) or breaks the authorisation invariant (naive fix). The correct fix is onlyAdmin on airdrop, treating it as another minting function. The audit finding writes itself.
If you don’t want to run: still document the spec in certora/MiniLending.spec and include it in the audit deliverables. The spec is part of the report; running it is a separate operational step.
10.5 Lab 3 — The bug Echidna missed in N runs
Set up a basic Echidna campaign:
// test/echidna/EchidnaLending.sol
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;
import "../../src/MiniLending.sol";
contract EchidnaLending {
MiniLending lending;
constructor() {
lending = new MiniLending();
}
function echidna_balance_within_supply() public view returns (bool) {
return lending.balanceOf(address(this)) <= lending.totalSupply();
}
}# echidna.yaml
testMode: property
testLimit: 1000000
seqLen: 50Run:
echidna test/echidna/EchidnaLending.sol --contract EchidnaLending --config echidna.yamlCritical observation: Echidna’s auto-generated transaction sequences will include airdrop(this, ...) with high probability and will catch this bug — because the contract has only six functions. The bug-by-fuzz / bug-by-FV gap closes when the search space is small.
Now make the bug subtle: wrap airdrop behind a permission gate that’s almost-correct:
mapping(address => bool) public airdropper;
function setAirdropper(address who, bool ok) external onlyAdmin {
airdropper[who] = ok;
}
function airdrop(address to, uint256 amount) external {
require(airdropper[msg.sender] || tx.origin == admin, "not allowed"); // ← bug
balanceOf[to] += amount;
totalSupply += amount;
}The bug: tx.origin == admin is exploitable any time the admin interacts with an attacker contract (classic Week 05 anti-pattern). Plus, the airdropper map starts empty, so the only path is through tx.origin.
Echidna at default config will rarely synthesise the exact sequence “admin calls a contract that re-enters and calls airdrop”. The search space includes choosing msg.sender == admin for the right call out of 2^160 addresses.
Halmos (with caller as a symbolic address) tries all caller values; the tx.origin == admin path is one solver constraint away.
Certora (with a parametric rule “no function makes a non-admin a minter”) flags the tx.origin == admin branch as a path violating the rule.
Demo: run Echidna for 1M sequences — usually no fail unless the admin-EOA is included in targetSenders. Run Halmos / Certora — instant counterexample.
The pedagogical point: FV is not magic, but it eliminates the “I needed to think of the exact sequence” requirement. The auditor states the property, the solver figures out the sequence.
10.6 Lab — Stretch
- Add a re-entry path: make
withdrawcall an externalreceivehook. Write a Halmoscheck_no_reentry_can_drainrule and prove it. (Hint: Halmos will need an explicit external mock for the callback.) - Ghost-based solvency: in Certora, add a ghost
mathint totalDepositedandtotalWithdrawn, then provetotalAssets() >= totalDeposited - totalWithdrawnas an invariant. Add a deliberate accounting bug (forget to decrement onwithdraw); confirm Certora catches it. - Reverse engineer: take the Curve
virtual_priceread-only reentrancy from Tuan-05-Vulnerability-Classes-Part-1 §2.2.4. Write a Halmos rule that captures the property “view returns the same value as a hypothetical fresh state read”. This is hard — by the end, you’ll understand why read-only reentrancy is so widely missed.
11. Anti-patterns (FV-specific, add to the master checklist)
- Spec uses
requireto rule out the bug input. Therequireis a trust assumption. If it’s not tied to an enforced contract precondition, the proof is vacuous in the real world. -
envfreedeclared on a function that readsblock.timestamp. Certora may detect; Halmos won’t. Audit by inspection. - Summary
NONDETon an external call whose return value matters for the property. The spec is trivially true but the real callee may behave differently — false-pass. - Halmos rule with
--loopdefault and an unbounded loop in the contract. Bound is silently applied; spec only covers small N. - CVL rule passes; the function under test is dead code. Sanity check would catch; turn
--rule_sanityon. - CVL invariant proved with a preserved block that effectively assumes the conclusion. Auditor reflex: every preserved-block precondition is a hand-wave to inspect.
- FV done on a stale commit. Specs must be regression-run on the deployed bytecode, not the branch from three weeks ago.
- Treating “proved” as “secure”. It means the spec passed. The spec might be incomplete.
- No fuzz baseline. If you can’t make Echidna or Foundry-invariant break the property in 10M runs first, you may be polishing a spec for a property that’s not even ambiguous.
- One auditor wrote the spec; same auditor reviewed it. Specs need adversarial review the same way code does.
12. Trade-offs
| Decision | Option A | Option B | Auditor’s view |
|---|---|---|---|
| First-line FV tool | Halmos | Certora | Halmos for fast iteration; Certora for protocol-wide invariants. Real shops use both. |
| Spec language proficiency | CVL only | CVL + Halmos + SMT-LIB | Multi-tool is the senior path. CVL alone is fine for years. |
| Free vs commercial | Free tier + Halmos | Full Certora engagement | For independent auditors, free tier is workable for medium engagements. For top-tier audits (Aave-class), commercial. |
| Spec depth | One rule per invariant (terse) | Many small rules + sanity (verbose) | Verbose. Each small rule fails for a clear reason; one big rule fails ambiguously. |
| When to start FV in engagement | Day 1 | After fuzz harness exists | After fuzz. Fuzz first surfaces “obvious” bugs at minimum cost; FV is for the rest. |
Trusting preserved blocks | Liberal | Conservative | Conservative. Every require in a preserved block is a trust assumption that must appear in the report. |
| KEVM / Kontrol | Skip | Adopt | Skip for routine audits. Adopt only when working at protocol/EVM-EIP layer. |
13. Quiz (≥80% to claim FV literacy)
-
Q: What’s the difference between bounded model checking and inductive theorem proving? A: BMC unrolls execution to a fixed depth N and proves the property over all paths of length ≤ N — sound only up to that bound. Inductive theorem proving proves a base case plus inductive step, so the property holds for unbounded execution lengths. Halmos is BMC; Certora invariants are inductive (with SMT-discharged steps).
-
Q: Why does Certora call a rule “vacuous” and why is that a finding? A: A vacuous rule has preconditions that are unsatisfiable — the assertion is trivially true because no input even reaches it. It’s worse than failing because it gives false confidence. Solution: enable
--rule_sanityto flag vacuity. -
Q: You write a CVL invariant
totalSupply() == sumOfBalances. Certora reports VIOLATED ontransfer(a, b, amt). The CEX showsa == b. How do you proceed? A: Self-transfer corner. Either (a) prove the contract guardsa != b, or (b) addpreserved transfer(address from, address to, uint256 amt) with (env e) { require from != to; }and explicitly note in the audit report that self-transfer is excluded from the proof. Then verify self-transfer behaviour separately. -
Q: Halmos reports a
check_*rule asPASSEDbut you suspect a loop bound caused it. What do you do? A: Check the Halmos output for “path truncated due to loop bound” messages. Increase--loopand re-run. If you can’t make the bound large enough due to solver timeout, the rule is only proved up to that depth — flag in the report. -
Q: Why is
DISPATCHER(true)usually preferred overNONDETfor ERC-20 calls in CVL? A:DISPATCHER(true)dispatches to a concrete implementation when one is available in the verification context, allowing the prover to reason about actual ERC-20 semantics (balanceOf,transferFrom).NONDETmakes the call return an arbitrary value, which is too weak — most ERC-20-using rules become vacuous or unprovable. UseNONDETonly when the external is genuinely opaque (oracles, off-chain). -
Q: When would you reach for KEVM / Kontrol over Certora? A: When the property is about EVM semantics rather than Solidity semantics — e.g., verifying a new precompile, an EIP, gas-accounting in a payment protocol, or a property that must hold against the exact bytecode independent of compiler version.
-
Q: A protocol has a 1000-iteration loop computing fee accruals. Can you Halmos this? A: Not directly — Halmos would need
--loop 1000, which likely times out. Strategies: (1) prove an invariant about a single iteration of the loop body (loop-invariant style) and inductively argue the whole loop preserves it; (2) refactor the contract to a closed-form computation; (3) use Certora’s invariant machinery, which handles unbounded iteration via induction. -
Q: Echidna runs 10M cases and finds nothing. Should you trust the protocol? A: No. The negative result is evidence of rarity, not absence. If the invariant is high-value, follow up with Halmos or Certora to prove the property over the symbolic space — or accept that you only have probabilistic confidence.
-
Q: A CVL preserved block contains
require msg.sender != address(0). Is that a problem? A: Almost certainly not —msg.sender == 0is not a reachable EVM state in practice (CALLER opcode never produces zero in normal transactions). Acceptable. The auditor’s reflex remains: every preserved-block constraint must be documented and justified. -
Q: A protocol team says “we have 95% line coverage in fuzz; we don’t need FV”. How do you push back? A: Line coverage measures paths exercised, not states explored. A fuzz that hits every line still may never enter the specific state combination that triggers a bug — and the line-coverage metric is silent about it. FV reasons about the state space, which is the right granularity for accounting invariants. Pushback with a concrete invariant the fuzz cannot prove (e.g., “totalSupply == sum of balances after any sequence of N transactions”), and offer to formalise it.
14. Deliverables
- Lab 1 (Halmos
check_balanceWithinSupply) — buggy version surfaces CEX; patched version proves. - Lab 2 (Certora
noUnauthorisedSupplyIncrease) — spec written; run if free-tier configured. - Lab 3 (the Echidna-miss demo) — written walk-through plus the contrasting Halmos / Certora result.
- Notes: in your own words, the four cases where FV beats fuzzing and the four where fuzzing wins.
- One paragraph: how you would pitch FV to a client deciding between a 50K + $40K-FV audit. What’s the case, what’s the deliverable, what’s the maintenance commitment?
- Update the master audit checklist with the FV anti-patterns from §11.
15. Where this leads
This bonus chapter slots into your workflow at the top end of the rigor scale. Most audits will still rely on Slither + manual review + Echidna + Foundry-invariant — and rightly so. FV is the tool you reach for when:
- The protocol’s TVL is large enough that a single sparse-input bug is unacceptable.
- The accounting math is the primary product, not an incidental.
- The protocol will undergo many upgrades and regression-safe specs are worth their weight.
- The client is willing to fund the engagement.
The complementary chapters are:
- Tuan-Bonus-Fuzzing-Invariant-Advanced — the other half of property-based verification; handler-based fuzzing, stateful invariants, Medusa-specific patterns.
- Tuan-Bonus-ZK-Circuit-Security — circuit-level formal reasoning; under-constrained-bug detection.
- Tuan-15-Audit-Methodology-Tooling — the parent methodology lesson that places FV in the audit pipeline.
Read once for orientation, return when you have a protocol where the math actually has to be right. The first time you ship a Certora VIOLATED finding to a client who would have shipped with a fuzz pass, you’ll understand why the top of the audit market has converged on this toolchain.
Last updated: 2026-05-16 See also: MOC-Web3-Security-Mastery · Tuan-15-Audit-Methodology-Tooling · Tuan-Bonus-Fuzzing-Invariant-Advanced · Tuan-Bonus-ZK-Circuit-Security · invariant-test-templates · References