apogee-proof

TOML-driven test harness engine for verification proof runs and structured reports. Executes test suites, builds proof reports, and tracks surface coverage.

MCP surface: CLI only — no MCP tools. Gate evaluation consumes proof reports.
Install: cargo build -p apogee-proof-harness --release (included with the Apogee server install)
apogee-proof-harness --suite python-engine

Synopsis

apogee-proof-harness [OPTIONS]
apogee-proof-harness --suite <name>
apogee-proof-harness --tag <tag> [--tag <tag>...]
apogee-proof-harness --layer <layer> --safety-gate

Description

The proof harness reads TOML suite definitions from verification_tests/harness/ and dispatches tests. Every test is a [[step]] entry in a TOML file. Results are collected into structured JSON proof reports that the gate system consumes.

The harness supports multiple execution modes: run all suites, filter by suite name, filter by tag, filter by layer, and safety-gated execution that halts on first failure.

Configuration

Flag           Description                            Default
--harness-dir  Directory containing TOML suite files  verification_tests/harness/
--suite        Run a specific suite by name           (all suites)
--tag          Filter steps by tag (repeatable)       (no filter)
--group        Filter steps by group                  (no filter)
--layer        Filter by verification layer           (no filter)
--min-depth    Minimum proof depth to run             (no minimum)
--safety-gate  Halt on first failure                  (off)
--fail-fast    Stop suite on first step failure       (off)
--report-path  Output path for proof report JSON      (auto)
--report-mode  Report format: full or ci-run          full
--timeout      Per-step timeout in seconds            300
--repo-root    Repository root for context            .

Harness Architecture

Key modules:

  • engine.py — TOML loader, step dispatch, proof report generation

  • assertions.py — JSON response matching DSL

  • transports.py — bridge and Unix socket transports

  • orchestration.py — variable capture and injection across steps

  • script_wrapper.py — type="script" step dispatch

Suite schema: each suite is a TOML file with [[step]] entries. See spec/harness_step_schema.md for the full field reference.

Adding a test: add a [[step]] to a suite file, update INDEX.toml, verify with apogee-proof-harness --suite <name>.

Proof Depth Levels

Depth        Description
smoke        Quick sanity check
functional   Feature-level verification
adversarial  Stress and edge-case testing
destructive  Fault injection and recovery
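One way to picture how a minimum depth such as the one passed to --min-depth might be applied is sketched below. It is only a sketch: it assumes the levels are ordered from shallowest to deepest in the order they are listed above, and that a step runs when its depth is at or above the minimum. The names `DEPTH_ORDER`, `depth_rank`, and `meets_min_depth` are hypothetical and not part of the harness.

```python
from typing import Optional

# Assumed ordering, shallowest first, following the order the levels are listed in.
DEPTH_ORDER = ["smoke", "functional", "adversarial", "destructive"]


def depth_rank(depth: str) -> int:
    """Return the position of a depth level, rejecting unknown names."""
    try:
        return DEPTH_ORDER.index(depth)
    except ValueError:
        raise ValueError(f"unknown proof depth: {depth!r}") from None


def meets_min_depth(step_depth: str, min_depth: Optional[str] = None) -> bool:
    """True when the step's depth is at or above the minimum (None = no minimum)."""
    if min_depth is None:
        return True
    return depth_rank(step_depth) >= depth_rank(min_depth)


# With a minimum of "functional", smoke steps are skipped and deeper ones run.
for depth in DEPTH_ORDER:
    print(depth, meets_min_depth(depth, "functional"))
```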

Report Output

The harness produces a version-1 JSON proof report with:

Field     Type    Description
test_id   string  Unique identifier for this proof run
verdict   string  Final outcome: passed, failed, or error
evidence  array   Commands executed, outputs, exit codes
duration  number  Total wall-clock time in seconds
surface   object  Surface coverage summary

Surface coverage tracks which declared proof surfaces have passing evidence:

{
  "declared": ["unit", "integration", "lint"],
  "covered": ["unit", "lint"],
  "coverage_pct": 66.7,
  "missing": ["integration"]
}

A surface is covered when at least one proof command targeting it has a passed verdict. Gate evaluation uses surface coverage to determine proof completeness.
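The coverage rule above can be sketched in a few lines of Python. This is an illustration, not the harness's implementation: the function name `surface_coverage`, the representation of evidence as (surface, verdict) pairs, and the choice to report 0.0 when no surfaces are declared are all assumptions.

```python
def surface_coverage(declared, evidence):
    """Summarize which declared surfaces have at least one passing command.

    `evidence` is a hypothetical list of (surface, verdict) pairs, one per
    proof command; the real report's evidence entries may be shaped differently.
    """
    passed = {surface for surface, verdict in evidence if verdict == "passed"}
    covered = [s for s in declared if s in passed]
    missing = [s for s in declared if s not in passed]
    # Assumption: an empty declaration counts as 0.0 percent coverage.
    pct = round(100.0 * len(covered) / len(declared), 1) if declared else 0.0
    return {
        "declared": list(declared),
        "covered": covered,
        "coverage_pct": pct,
        "missing": missing,
    }


summary = surface_coverage(
    ["unit", "integration", "lint"],
    [
        ("unit", "passed"),
        ("integration", "failed"),  # a failed command does not cover its surface
        ("lint", "error"),
        ("lint", "passed"),  # one passing command is enough
    ],
)
print(summary)
```

With these inputs the summary matches the JSON example above: unit and lint are covered, integration is missing, and coverage is 66.7 percent.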

Python API

For programmatic use, the proof harness also exposes a Python API:

from proof_harness.runner import LocalProofRunner
from proof_harness.report import ProofReportBuilder

# Run a proof command and capture its result
runner = LocalProofRunner(install_state="fresh-install")
result = runner.run_command(["./scripts/run_unit_tests.sh"])

# Start a report for the "feature" proof and attach the command result
report = runner.build_report("feature")
report.add_command("run_unit_tests", result)

# Record the outcome, test counts, and trust level
report.set_outcome("passed")
report.set_counts(total=42, passed=42, failed=0)
report.set_trust("trusted")

Examples

# Run all suites
apogee-proof-harness

# Run one suite
apogee-proof-harness --suite python-engine

# Run E2E layer with safety gate
apogee-proof-harness --layer system-e2e --safety-gate

# Filter by tag, output CI report
apogee-proof-harness --tag smoke --report-mode ci-run

# Custom timeout and report path
apogee-proof-harness --suite integration \
  --timeout 600 --report-path results/proof.json

Note

This page is also available as a man page: man apogee-proof

See Also

  • Gate System — gates that consume proof reports

  • Architecture — proof harness in the component model

  • Glossary — terminology for proof depth, surfaces, and verdicts

  • spec/harness_step_schema.md — TOML step field reference