apogee-tree
Clean-room Rust library that parses source code via tree-sitter and emits a Code Property Graph (CPG) — the structural foundation for all CPG query tools.
Synopsis
apogee-tree is a library crate, not a standalone CLI. It is consumed
by apogee-manifest (the CLI) and the DuckDB store layer.
    use apogee_tree::{CpgBuilder, CpgNode, CpgEdge};

    let mut builder = CpgBuilder::new(repo_root)
        .set_capture_bodies(true);
    builder.add_source(source_bytes, "main.c");
    let (cpg, body_content) = builder.build();
Description
apogee-tree extracts structural information from source code using tree-sitter parsers. It produces a graph of nodes (functions, classes, variables, calls, parameters, literals) and edges (calls, contains, data_flows_to, defined_by, used_by, arg_passes_to, has_parameter) — the Code Property Graph.
Supported languages (8):
- C, C++ (including headers)
- Python
- Go
- Rust
- Java
- JavaScript (including JSX)
- TypeScript (including TSX)
C/C++ parsing model:
Each file is parsed independently — apogee-tree does not follow
#include directives or run the C preprocessor. Header files
(.h, .hpp, .hxx, .hh) are parsed with the same
visitor as their corresponding source files and produce the same
node types.
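
As a rough illustration of the header rule, the sketch below checks a path against the four header extensions listed above. The helper name `is_c_family_header` is hypothetical and not part of apogee-tree; the crate's actual visitor dispatch is not shown on this page, and the match here is assumed to be case-sensitive.

```rust
use std::path::Path;

/// Header extensions listed in the parsing-model paragraph above.
const HEADER_EXTENSIONS: [&str; 4] = ["h", "hpp", "hxx", "hh"];

/// Hypothetical helper (not an apogee-tree API): returns true when `path`
/// ends in one of the header extensions. The comparison is case-sensitive.
pub fn is_c_family_header(path: &str) -> bool {
    Path::new(path)
        .extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| HEADER_EXTENSIONS.contains(&ext))
        .unwrap_or(false)
}
```

Under these assumptions, `is_c_family_header("include/list.h")` is true, while `main.c` and extensionless files such as `Makefile` are rejected.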
What is captured:
- `#include` directives are emitted as `import` nodes (the include path is extracted, but the target file is not inlined)
- `#define` constants are emitted as `macro` nodes with `is_function_like: false`
- `#define` function-like macros (e.g., `MIN(a, b)`) are emitted as `macro` nodes with `is_function_like: true`
- Kernel storage-class annotations (`__init`, `__exit`, `__always_inline`, `__cold`, `__hot`, `__noinline`) are stripped from function return types so they do not pollute signatures
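
The two text-level rules above (macro classification and annotation stripping) can be sketched as follows. This is a minimal, string-based approximation and not the crate's implementation, which works on tree-sitter syntax nodes; the function names `is_function_like` and `strip_annotations` are hypothetical. The classification relies on the C preprocessor rule that a macro is function-like only when `(` follows the macro name with no intervening whitespace.

```rust
/// Kernel annotations named in the list above.
const KERNEL_ANNOTATIONS: [&str; 6] = [
    "__init", "__exit", "__always_inline", "__cold", "__hot", "__noinline",
];

/// Hypothetical helper: classifies a single-line `#define`.
/// Returns None when the line is not a define at all.
pub fn is_function_like(line: &str) -> Option<bool> {
    let rest = line.trim_start().strip_prefix('#')?.trim_start();
    let rest = rest.strip_prefix("define")?;
    // Require whitespace after `define` so that `#defined` is rejected.
    if !rest.starts_with(|c: char| c.is_whitespace()) {
        return None;
    }
    let rest = rest.trim_start();
    // Length of the macro name (identifier characters only).
    let name_len = rest
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(rest.len());
    if name_len == 0 {
        return None;
    }
    // Function-like only if `(` immediately follows the name.
    Some(rest[name_len..].starts_with('('))
}

/// Hypothetical helper: drops annotation tokens from a return-type string.
/// Token-based, so an annotation glued to `*` would not be removed.
pub fn strip_annotations(return_type: &str) -> String {
    return_type
        .split_whitespace()
        .filter(|tok| !KERNEL_ANNOTATIONS.contains(tok))
        .collect::<Vec<_>>()
        .join(" ")
}
```

With this sketch, `#define MIN(a, b) ...` classifies as function-like, `#define NEG (-1)` does not (the space before `(` makes it an object-like macro), and `strip_annotations("static int __init")` yields `static int`.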
What is not captured:
- Macro invocations that don’t expand to parseable constructs (e.g., `EXPORT_SYMBOL`, `MODULE_AUTHOR`) are not extracted as nodes; tree-sitter sees the unexpanded source
- Conditional compilation (`#ifdef`) is not evaluated; both branches are visible to the parser
Node kinds (17):
| Kind | Description |
|---|---|
| | Top-level file scope |
| | Class or struct definition |
| | Function or method definition |
| | Variable declaration or assignment |
| | Function parameter |
| | Function call site |
| | Argument at a call site |
| | Literal value (string, number, etc.) |
| | Return statement |
| `import` | Import or include statement |
| | If/else/switch condition |
| | For/while/do loop |
| | Code block scope |
| | Type alias or typedef |
| `macro` | Macro definition |
| | Enum definition |
| | Namespace or module scope |
Edge kinds (7):
| Kind | Description |
|---|---|
| `contains` | Parent scope contains child node |
| `calls` | Call site resolves to function definition |
| `has_parameter` | Function has a parameter |
| `data_flows_to` | Data flows from source to target |
| `defined_by` | Variable defined by an expression |
| `used_by` | Value used by a consumer |
| `arg_passes_to` | Call argument maps to callee parameter |
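
To make the edge semantics concrete, here is a small self-contained model of the seven edge kinds named in the description. It is only a sketch: the enum variants, the `Edge` struct, and the helper functions are hypothetical and do not reflect the real `EdgeKind` or `CpgEdge` definitions in `src/model.rs`, and the edge directions (parent to child, call site to definition, argument to parameter) are assumptions read off the descriptions above.

```rust
/// Hypothetical mirror of the seven edge kinds (not the crate's own enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EdgeKind {
    Contains,
    Calls,
    HasParameter,
    DataFlowsTo,
    DefinedBy,
    UsedBy,
    ArgPassesTo,
}

impl EdgeKind {
    /// The snake_case label used in the description.
    pub fn label(self) -> &'static str {
        match self {
            EdgeKind::Contains => "contains",
            EdgeKind::Calls => "calls",
            EdgeKind::HasParameter => "has_parameter",
            EdgeKind::DataFlowsTo => "data_flows_to",
            EdgeKind::DefinedBy => "defined_by",
            EdgeKind::UsedBy => "used_by",
            EdgeKind::ArgPassesTo => "arg_passes_to",
        }
    }
}

/// Directed edge between two node ids (assumed direction: src -> dst).
#[derive(Debug)]
pub struct Edge {
    pub src: usize,
    pub dst: usize,
    pub kind: EdgeKind,
}

/// Hand-built graph for: `int add(int a) { return a; } int main() { add(1); }`
/// Node ids are indices into the returned name list.
pub fn tiny_graph() -> (Vec<&'static str>, Vec<Edge>) {
    let nodes = vec!["main.c", "add", "a", "main", "add(1)", "1"];
    let edges = vec![
        Edge { src: 0, dst: 1, kind: EdgeKind::Contains },     // file contains add
        Edge { src: 0, dst: 3, kind: EdgeKind::Contains },     // file contains main
        Edge { src: 1, dst: 2, kind: EdgeKind::HasParameter }, // add has parameter a
        Edge { src: 3, dst: 4, kind: EdgeKind::Contains },     // main contains the call site
        Edge { src: 4, dst: 1, kind: EdgeKind::Calls },        // call site resolves to add
        Edge { src: 5, dst: 2, kind: EdgeKind::ArgPassesTo },  // argument 1 maps to parameter a
    ];
    (nodes, edges)
}

/// Ids of the function definitions a call site resolves to.
pub fn callees(edges: &[Edge], call_site: usize) -> Vec<usize> {
    edges
        .iter()
        .filter(|e| e.kind == EdgeKind::Calls && e.src == call_site)
        .map(|e| e.dst)
        .collect()
}
```

In this toy graph, following the `calls` edge from node 4 (the call site `add(1)`) leads to node 1 (the definition of `add`).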
Body capture: when capture_bodies is enabled, each
function’s body text is hashed with Blake3 and stored as a
body_hash attribute. The hash enables change detection
across commits without storing full body text for every
historical snapshot.
Skeleton mode: when hash_only is set on visitors,
body_hash is computed but the full body text is not captured.
Used for historical commit indexing where only structural
identity and change detection are needed.
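
The change-detection idea can be sketched as below: keep only a per-function hash for each snapshot and compare the hashes across two commits. To stay dependency-free, this sketch swaps in the standard library's `DefaultHasher` for Blake3, so the values are not the `body_hash` values apogee-tree produces, and `DefaultHasher` output should not be persisted across Rust versions. The function names `body_hash`, `snapshot`, and `changed_functions` are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in digest for illustration only; the real attribute uses Blake3.
pub fn body_hash(body: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    body.hash(&mut hasher);
    hasher.finish()
}

/// Skeleton-style snapshot: function name -> hash, with no body text kept.
pub fn snapshot(functions: &[(&str, &str)]) -> HashMap<String, u64> {
    functions
        .iter()
        .map(|(name, body)| (name.to_string(), body_hash(body)))
        .collect()
}

/// Names present in both snapshots whose hash differs, sorted by name.
/// Functions that were added or removed are not reported here.
pub fn changed_functions(
    old: &HashMap<String, u64>,
    new: &HashMap<String, u64>,
) -> Vec<String> {
    let mut changed: Vec<String> = new
        .iter()
        .filter(|(name, hash)| old.get(*name).map_or(false, |prev| prev != *hash))
        .map(|(name, _)| name.clone())
        .collect();
    changed.sort();
    changed
}
```

Comparing a snapshot where `main` returns 0 against one where it returns 1 reports `main` as changed, while an untouched `helper` is not reported, even though neither body text was stored.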
Reference
Crate structure (manifest/tree/):
- `src/model.rs`: `CpgNode`, `CpgEdge`, `NodeKind`, `EdgeKind`, `SourceLocation` types
- `src/builder.rs`: `CpgBuilder` with `add_source`, `build`, `build_with_sink`, `resolve_calls`, `compute_type_refs`
- `src/visitors/`: per-language visitor implementations (`c.rs`, `cpp.rs`, `python.rs`, `go.rs`, `rust_lang.rs`, `java.rs`, `js.rs`, `ts.rs`)
- `src/visitors/mod.rs`: `get_visitor_with_opts` factory
Key APIs:
    // Build CPG from source files
    // (`mut` matches the Synopsis: `add_source` is called on the builder afterwards)
    let mut builder = CpgBuilder::new(repo_root)
        .set_capture_bodies(true)
        .set_hash_only(false);
    builder.add_source(bytes, "file.c");
    let (cpg, bodies) = builder.build();

    // Query the graph
    cpg.node_count();  // total nodes
    cpg.edge_count();  // total edges
    cpg.nodes();       // iterator over CpgNode
    cpg.edges();       // iterator over CpgEdge
Note
This page is also available as a man page:
man apogee-tree
See Also
CPG Engine — CPG query tools built on apogee-tree output
apogee-manifest — CLI that drives apogee-tree