Perigee
OTP-supervised chain execution engine. Runs lifecycle chains as isolated, crash-recoverable processes with completion contracts, failure policies, and multi-agent session coordination.
MCP tools: define_chain, execute_chain, checkpoint_chain, resume_chain.
Description
Perigee is the execution engine behind Apogee’s lifecycle chains. When
you call execute_chain through an MCP-connected assistant, the MCP
server routes the request to Perigee over a Unix socket. Perigee manages
the chain session as a supervised OTP GenServer process.
Why Perigee exists: the Python in-memory chain executor works for single-session use, but it can’t isolate failures, run concurrent chains, or survive process crashes. Perigee provides all three — plus completion contracts that verify skill execution before allowing a step to advance.
What Perigee Provides
Crash isolation — each chain session is a supervised GenServer process. If one session crashes, others continue unaffected. OTP supervision restarts failed sessions automatically.
Concurrent sessions — dozens of chains run in parallel, each with independent state. Sessions are tracked per-project with automatic cleanup.
Completion contracts — steps can’t just claim “passed.” The
executor verifies that the required skill was actually invoked
(skill_invoked flag) and that expected artifacts exist before
advancing. A step that reports success without doing the work gets
rejected with a completion_contract_violated error.
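The contract check described above can be sketched in a few lines of Python. This is an illustration only, not Perigee's Elixir implementation: the StepReport shape, the verify_completion helper, and the exception class are hypothetical, and only the skill_invoked flag and the completion_contract_violated error name come from the text.

```python
from dataclasses import dataclass, field
from pathlib import Path


class CompletionContractViolated(Exception):
    """Raised when a step claims success without evidence of the work."""


@dataclass
class StepReport:
    # Hypothetical report shape; only skill_invoked is named in the docs.
    status: str
    skill_invoked: bool
    artifacts: list[str] = field(default_factory=list)


def verify_completion(report: StepReport, base_dir: Path) -> None:
    """Reject a 'passed' report unless the skill ran and all artifacts exist."""
    if report.status != "passed":
        return  # Failures are handled by failure policies, not by this check.
    if not report.skill_invoked:
        raise CompletionContractViolated(
            "completion_contract_violated: skill was not invoked"
        )
    missing = [a for a in report.artifacts if not (base_dir / a).exists()]
    if missing:
        raise CompletionContractViolated(
            f"completion_contract_violated: missing artifacts {missing}"
        )
```

The point of the design is that the step's own claim of success is never sufficient: both the invocation flag and the artifacts are checked independently before the step may advance.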
Failure policies — configurable per failure class:
| Failure class | Description | Default policy |
|---|---|---|
|  | Implementation error | Retry current step |
|  | Test infrastructure problem | Retry current step |
|  | Design question needs resolution | Backtrack to earlier step |
|  | Unrecoverable failure | Stop the chain |
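As a rough sketch of how per-class policies with defaults might be resolved, consider the Python below. The enum member names are hypothetical stand-ins for the four rows of the table, not Perigee's actual identifiers, and resolve_policy and next_step_index are illustrative helpers.

```python
from enum import Enum
from typing import Optional


class FailureClass(Enum):
    # Hypothetical identifiers standing in for the four table rows.
    IMPLEMENTATION_ERROR = "implementation_error"
    TEST_INFRASTRUCTURE = "test_infrastructure"
    DESIGN_QUESTION = "design_question"
    UNRECOVERABLE = "unrecoverable"


class Policy(Enum):
    RETRY = "retry"          # Retry current step
    BACKTRACK = "backtrack"  # Backtrack to earlier step
    STOP = "stop"            # Stop the chain


DEFAULT_POLICIES = {
    FailureClass.IMPLEMENTATION_ERROR: Policy.RETRY,
    FailureClass.TEST_INFRASTRUCTURE: Policy.RETRY,
    FailureClass.DESIGN_QUESTION: Policy.BACKTRACK,
    FailureClass.UNRECOVERABLE: Policy.STOP,
}


def resolve_policy(failure: FailureClass, overrides: Optional[dict] = None) -> Policy:
    """Per-class overrides win; otherwise fall back to the default table."""
    return (overrides or {}).get(failure, DEFAULT_POLICIES[failure])


def next_step_index(policy: Policy, current: int, backtrack_to: int = 0) -> Optional[int]:
    """Return the step index to run next, or None when the chain stops."""
    if policy is Policy.RETRY:
        return current
    if policy is Policy.BACKTRACK:
        return backtrack_to
    return None
```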
Control commands — chains support runtime control:
| Command | Description |
|---|---|
|  | Suspend execution, keep state |
|  | Continue from paused state |
|  | Skip current step and advance |
|  | Rewind to a prior step (resets downstream) |
|  | Enter manual override mode |
|  | Terminate the chain |
|  | Force-advance past current step |
|  | Hand session ownership to another agent |
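The "resets downstream" behaviour of the rewind command can be illustrated with a small Python sketch. It assumes step state is a list of status strings and that the target step and everything after it return to "pending"; both the representation and the rewind helper are hypothetical.

```python
def rewind(step_states: list[str], target: int) -> list[str]:
    """Return a copy with the target step and all later steps reset to pending."""
    if not 0 <= target < len(step_states):
        raise IndexError(f"no step at index {target}")
    # Steps before the target keep their recorded status.
    return step_states[:target] + ["pending"] * (len(step_states) - target)
```

For example, rewinding a four-step chain to index 1 keeps the first step's result and clears the rest, so work that depended on the rewound step is redone rather than trusted.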
Checkpoint persistence — chain state is serialized to JSON and stored on disk. Sessions survive process restarts and can be resumed in a different session or by a different agent.
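A minimal Python sketch of JSON checkpointing follows. It assumes one file per session named after the session ID and uses a write-then-rename pattern so a crash mid-write cannot leave a truncated checkpoint. The file layout and the save_checkpoint and load_checkpoint helpers are assumptions, not Perigee's actual CheckpointStore.

```python
import json
import os
import tempfile
from pathlib import Path


def save_checkpoint(directory: Path, session_id: str, state: dict) -> Path:
    """Write state as JSON to a temp file, then atomically rename into place."""
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / f"{session_id}.json"
    fd, tmp_name = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        json.dump(state, tmp)
        tmp.flush()
        os.fsync(tmp.fileno())  # Make sure bytes hit disk before the rename.
    os.replace(tmp_name, target)
    return target


def load_checkpoint(directory: Path, session_id: str) -> dict:
    """Read a previously saved checkpoint back into a dict."""
    with open(directory / f"{session_id}.json", encoding="utf-8") as fh:
        return json.load(fh)
```

Because the checkpoint is plain JSON on disk, any process that can read the directory can resume the session, which is what allows a different session or agent to pick the chain up.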
Event bus — state transitions (mode changes, step completions) are broadcast to subscribers. Enables multi-agent coordination where one agent can react to another agent’s chain progress.
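The publish/subscribe pattern behind the event bus can be sketched in Python as below. This assumes subscriptions are keyed by project (the Architecture section describes them as project-scoped); the EventBus class, its method names, and the event dictionary shape are hypothetical.

```python
from collections import defaultdict
from typing import Callable


class EventBus:
    """In-process pub/sub where subscribers only see events for their project."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, project: str, callback: Callable[[dict], None]) -> None:
        self._subscribers[project].append(callback)

    def publish(self, project: str, event: dict) -> int:
        """Deliver event to the project's subscribers; return how many received it."""
        callbacks = list(self._subscribers.get(project, []))
        for callback in callbacks:
            callback(event)
        return len(callbacks)
```

A second agent would subscribe to the project it cares about and react when, for example, a step-completion event arrives from another agent's chain.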
Agent transfer — a running chain session can be transferred from one agent to another mid-execution. The receiving agent picks up at the current step with full state.
Configuration
Perigee is selected via the APOGEE_EXECUTOR environment variable:
| Value | Backend | Description |
|---|---|---|
|  | Perigee (default) | Unix socket connection to Perigee daemon |
|  | In-memory Python | Single-session, no crash isolation |
The socket path defaults to /tmp/perigee.sock and can be
overridden with PERIGEE_SOCKET.
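On the client side, socket path resolution might look like the following Python sketch. The PERIGEE_SOCKET variable and the /tmp/perigee.sock default come from the text above; the helper names, and the choice to treat an empty value as unset, are assumptions.

```python
import os
import socket
from typing import Mapping, Optional

DEFAULT_SOCKET_PATH = "/tmp/perigee.sock"


def resolve_socket_path(env: Optional[Mapping[str, str]] = None) -> str:
    """Return PERIGEE_SOCKET if set and non-empty, else the documented default."""
    env = os.environ if env is None else env
    return env.get("PERIGEE_SOCKET") or DEFAULT_SOCKET_PATH


def connect(path: Optional[str] = None) -> socket.socket:
    """Open a stream connection to the daemon's Unix socket (Unix-only)."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(path or resolve_socket_path())
    return sock
```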
Architecture
Perigee is implemented in Elixir (~2,500 lines) with a Rust bridge (~660 lines) for transport:
ChainSession — GenServer state machine implementing the full chain lifecycle (start, report, control, checkpoint, resume)
SessionRegistry — maps session IDs and workflow IDs to processes for routing
SessionSupervisor — DynamicSupervisor for per-session processes
CheckpointStore — JSON persistence for chain state
EventBus — project-scoped event subscriptions
SocketListener — Unix socket server accepting client connections
BridgeRouter — routes decoded commands to ChainSession processes
Codec — length-prefixed JSON framing for socket transport
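To make the Codec's length-prefixed JSON framing concrete, here is a hedged Python sketch. The source does not state the prefix width or byte order, so the 4-byte big-endian header is an assumption, as are the encode_frame and decode_frames helper names.

```python
import json
import struct

# Assumed header: 4-byte unsigned big-endian payload length.
HEADER = struct.Struct(">I")


def encode_frame(message: dict) -> bytes:
    """Serialize a message to JSON and prepend its byte length."""
    payload = json.dumps(message).encode("utf-8")
    return HEADER.pack(len(payload)) + payload


def decode_frames(buffer: bytes) -> tuple[list[dict], bytes]:
    """Decode every complete frame in buffer; return messages and leftover bytes."""
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # Frame is incomplete; wait for more bytes.
        body = buffer[offset + HEADER.size:end]
        messages.append(json.loads(body.decode("utf-8")))
        offset = end
    return messages, buffer[offset:]
```

Framing matters because a stream socket has no message boundaries: a single read may return half a message or several at once, and the length prefix lets the receiver split the stream correctly.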
See Also
Chain Engine: chain types, conditions, and step definitions
apogee-mcp: MCP tools that route to Perigee
Configuration: APOGEE_EXECUTOR and PERIGEE_SOCKET settings
How Apogee Works: Perigee as Layer 2