apogee-manifest

Static codebase analysis producing scored JSON manifests and a DuckDB Code Property Graph. Parses 8 languages via tree-sitter.

MCP surface: generate_manifest, rescore_manifest, summarize_manifest, get_manifest_context
Install: just install (builds from manifest/generator/)
Usage: apogee-manifest --repo . --label current

The Rust CLI (apogee-manifest) is the sole implementation. It supports multi-branch orchestration, incremental history indexing, and skeleton builds.

Description

The manifest generator performs static analysis of a codebase to produce a scored JSON manifest and a Code Property Graph (CPG) stored in DuckDB. It walks the file tree, parses source files via tree-sitter across 8 languages (C, C++, Python, Go, Rust, Java, JavaScript, TypeScript), extracts structural nodes and edges, and persists them to .apogee/cpg.duckdb.
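The file-walking step described above can be sketched roughly as follows. This is a minimal illustration, not the generator's actual code: the extension map, the skipped directory names, and the function name parse_candidates are all assumptions made for the example. Only the 8 language names and the 512000-byte default come from this page.

```python
import os
from pathlib import Path

# Hypothetical extension map for the 8 supported languages.
# The real tool's language detection rules may differ.
LANGUAGE_BY_EXTENSION = {
    ".c": "C", ".h": "C",
    ".cc": "C++", ".cpp": "C++", ".hpp": "C++",
    ".py": "Python",
    ".go": "Go",
    ".rs": "Rust",
    ".java": "Java",
    ".js": "JavaScript",
    ".ts": "TypeScript",
}

# Directories skipped by this sketch (an assumption, not documented behavior).
SKIPPED_DIRS = {".git", ".apogee"}


def parse_candidates(repo_root, max_bytes=512000):
    """Yield (relative_path, language) for files a parse pass would consider."""
    root = Path(repo_root)
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place and keep traversal order stable.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIPPED_DIRS)
        for filename in sorted(filenames):
            path = Path(dirpath) / filename
            language = LANGUAGE_BY_EXTENSION.get(path.suffix.lower())
            if language is None:
                continue  # not one of the supported languages
            if path.stat().st_size > max_bytes:
                continue  # mirrors the --max-bytes size cap
            yield path.relative_to(root).as_posix(), language
```

Calling list(parse_candidates(".")) would give the inventory that a parser pass could then iterate over; files above the size cap never reach the parser.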

The output includes:

  • Manifest JSON: metadata, file inventory, symbols, relationships, scores, AI regrade prompts, recommendations

  • DuckDB CPG: nodes (functions, classes, variables, calls), edges (calls, contains, data_flows_to, arg_passes_to), scope summaries (13 safety flags per function), embeddings, and body content (Blake3-hashed)

The Rust CLI (apogee-manifest) supports incremental branch loading (SQL-copy unchanged data from a previously-indexed branch), skeleton builds for historical commits, and per-subsystem history indexing.

Configuration (Rust CLI)

Flag                  Description                                           Default
--------------------  ----------------------------------------------------  --------------------
--repo                Repository root to analyze                            .
--label               Human-readable label for the manifest                 current
--max-bytes           Skip files larger than this                           512000
--ref                 Git ref to index (reads via git show)                 HEAD
--branch              Branch name to record in CPG metadata                 (auto)
--no-capture-bodies   Skip function body hashing                            (bodies captured)
--working-dir         Build overlay from uncommitted state                  (off)
--orchestrate         Multi-branch indexing mode                            (off)
--orchestrate-branch  Branch spec for orchestration (name=ref; repeatable)  (none)
--all-branches        Auto-enumerate remote branches for orchestration      (off)
--remote              Limit --all-branches to a specific remote             (all remotes)
--stale-days          Skip branches older than N days                       (no limit)
--exclude             Exclude branches matching glob (repeatable)           (none)
--history-depth       Index N per-subsystem historical commits              (off)
--history-tags        Index tagged releases matching glob                   (off)
--history-since       Only index history at or after this ref/tag           (no cutoff)
--history-for-path    Focus history indexing on a specific path             (per-subsystem auto)
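The branch-selection flags (--orchestrate-branch, --exclude, --stale-days) can be read as a parse step followed by a filter over candidate branches. The sketch below illustrates that reading only. The names parse_branch_spec, Branch, and select_branches are hypothetical, and the glob semantics (Python's fnmatchcase) and the age comparison are assumptions; the CLI's actual matching rules may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from fnmatch import fnmatchcase


def parse_branch_spec(spec):
    """Split a 'name=ref' spec into (name, ref), as passed to --orchestrate-branch."""
    name, sep, ref = spec.partition("=")
    if not sep or not name or not ref:
        raise ValueError(f"expected name=ref, got {spec!r}")
    return name, ref


@dataclass
class Branch:
    """Hypothetical record for an enumerated branch."""
    name: str
    last_commit: datetime


def select_branches(branches, exclude=(), stale_days=None, now=None):
    """Return names of branches that survive the exclude globs and the age cutoff."""
    now = now or datetime.now(timezone.utc)
    selected = []
    for branch in branches:
        # Drop branches matching any exclude glob (assumed glob dialect).
        if any(fnmatchcase(branch.name, pattern) for pattern in exclude):
            continue
        # Drop branches whose last commit is older than the cutoff, if one is set.
        if stale_days is not None and now - branch.last_commit > timedelta(days=stale_days):
            continue
        selected.append(branch.name)
    return selected
```

For example, parse_branch_spec("main=origin/main") yields the pair ("main", "origin/main"), which corresponds to the first branch spec in the multi-branch invocation under Examples.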

Reference

Scoring formula

Three component scores weighted into an overall score:

Dimension                    Calculation                                               Weight
---------------------------  --------------------------------------------------------  ---------
parse_coverage_score         Files parsed successfully / parse candidates              0.4
symbol_coverage_score        Files with symbols / parseable files                      0.4
relationship_coverage_score  Files with dependency edges / relationship-capable files  0.2
overall_score                Weighted combination of the three component scores        (derived)

When overall_score >= 95, the manifest is considered high-coverage.
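As a worked illustration of the weighting, the sketch below combines three component scores with the 0.4 / 0.4 / 0.2 weights and applies the 95 threshold. It assumes each component is a ratio scaled to 0-100, which the threshold suggests but this page does not state outright. The helper names and the zero-denominator handling are likewise assumptions of the example, not the generator's code.

```python
# Weights from the scoring table; they sum to 1.0.
WEIGHTS = {
    "parse_coverage_score": 0.4,
    "symbol_coverage_score": 0.4,
    "relationship_coverage_score": 0.2,
}
HIGH_COVERAGE_THRESHOLD = 95.0


def ratio_score(numerator, denominator):
    """Express numerator/denominator on an assumed 0-100 scale."""
    if denominator == 0:
        # Placeholder choice for this sketch; the real tool may handle this differently.
        return 0.0
    return 100.0 * numerator / denominator


def overall_score(components):
    """Weighted sum of the three component scores."""
    return sum(weight * components[name] for name, weight in WEIGHTS.items())


def is_high_coverage(components):
    """True when the weighted score reaches the high-coverage threshold."""
    return overall_score(components) >= HIGH_COVERAGE_THRESHOLD


# Example: 100 parse candidates, all parsed; 98 of them have symbols;
# 90 of 100 relationship-capable files have dependency edges.
example = {
    "parse_coverage_score": ratio_score(100, 100),
    "symbol_coverage_score": ratio_score(98, 100),
    "relationship_coverage_score": ratio_score(90, 100),
}
print(round(overall_score(example), 2), is_high_coverage(example))
```

With these inputs the weighted sum is 0.4 × 100 + 0.4 × 98 + 0.2 × 90 = 97.2, which clears the threshold. Dropping symbol coverage to 90 and relationship coverage to 80 gives 92 and falls below it.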

Rust crates (manifest/generator/):

  • apogee-tree — tree-sitter CPG library (8 language visitors, node/edge extraction, body hashing)

  • apogee-analyze — git operations, file walking, import/call extraction

  • apogee-graph — DuckDB store, delta writer, incremental branch loading, embedding computation

  • apogee-cli — CLI entry point, orchestrator, builder

Examples

# Generate a manifest
apogee-manifest --repo . --label current

# Multi-branch indexing
apogee-manifest --repo . --orchestrate \
  --orchestrate-branch main=origin/main \
  --orchestrate-branch feature=origin/feature-x

# Custom label and byte limit
apogee-manifest --repo ~/projects/foo --label production --max-bytes 1048576

Note

This page is also available as a man page: man apogee-manifest

See Also

  • CPG Engine — Code Property Graph built from manifest data

  • apogee-mcp — MCP tools for manifest operations (generate_manifest, rescore_manifest, summarize_manifest, get_manifest_context)