Pipeline
Four Claude agents, sequenced for shared context
Four specialized agents
By default, Causalist generates a graph by calling Claude Opus 4.7 in four roles. Each agent has its own system prompt that constrains output to strict JSONL (one parseable JSON object per line), so the browser can stream nodes and edges into the live force graph as they are emitted. Models are configurable per agent in the New Project modal: all four default to Opus 4.7, and Sonnet 4.6 and Haiku 4.5 are also wired in.
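To make the streaming contract concrete, here is a minimal sketch of an incremental JSONL reader. The class name `JsonlBuffer` and the policy of silently skipping malformed lines are assumptions for illustration, not Causalist's actual client code.

```typescript
// Incremental JSONL parser: feed it arbitrary text chunks and get back one
// parsed object per completed line. Hypothetical helper, not Causalist's API.
class JsonlBuffer {
  private pending = "";

  // Append a chunk and return every object whose line is now complete.
  push(chunk: string): unknown[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is an unfinished line (or ""); keep it for the next chunk.
    this.pending = lines.pop() ?? "";
    return this.parseLines(lines);
  }

  // Call once the stream ends to parse a final line with no trailing newline.
  flush(): unknown[] {
    const rest = this.pending;
    this.pending = "";
    return this.parseLines([rest]);
  }

  private parseLines(lines: string[]): unknown[] {
    const out: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === "") continue;
      try {
        out.push(JSON.parse(trimmed));
      } catch {
        // Assumed policy: skip lines that are not valid JSON.
      }
    }
    return out;
  }
}
```

Because a chunk boundary can fall in the middle of a line, the buffer holds the unfinished tail until the next chunk arrives, which is what lets each node or edge be rendered as soon as its line is complete.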
1. Structure agent
Walks the repo tree, classifies every file into one of seven layers — infra, data, logic, api, ui, test, config — and emits one CausalNode per line.
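As an illustration of what validating one emitted line might look like, the sketch below checks the layer against the seven names listed above. The `CausalNode` fields (`id`, `path`, `layer`) are assumed; the source names the type but does not list its fields.

```typescript
// The seven layers named in the text.
const LAYERS = ["infra", "data", "logic", "api", "ui", "test", "config"] as const;
type Layer = (typeof LAYERS)[number];

// Hypothetical shape: only the type name CausalNode comes from the source.
interface CausalNode {
  id: string;
  path: string;
  layer: Layer;
}

function isLayer(value: unknown): value is Layer {
  return typeof value === "string" && (LAYERS as readonly string[]).indexOf(value) !== -1;
}

// Parse one JSONL line; return null if it is not a well-formed node.
function parseNodeLine(line: string): CausalNode | null {
  let raw: unknown;
  try {
    raw = JSON.parse(line);
  } catch {
    return null;
  }
  if (typeof raw !== "object" || raw === null) return null;
  const { id, path, layer } = raw as Record<string, unknown>;
  if (typeof id !== "string" || typeof path !== "string" || !isLayer(layer)) return null;
  return { id, path, layer };
}
```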
2. Dependency agent
Receives Structure's node manifest plus up to 80 curated source files (entry-points + index files prioritized). Extracts typed edges — imports, calls, reads, writes, extends — and emits one CausalEdge per line. Edge endpoints are required to be exact node ids from the manifest.
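The curation step could be sketched as a priority sort followed by a cap. Everything specific here is an assumption: the `SourceFile` shape, the file-name patterns treated as entry points, and the choice to skip files over 30KB rather than truncate them.

```typescript
// Hypothetical input shape for the curation step.
interface SourceFile {
  path: string;
  sizeBytes: number;
}

const MAX_FILES = 80;        // cap stated in the text
const MAX_BYTES = 30 * 1024; // assumed reading of the 30KB per-file limit

// Assumed heuristic: a lower score sorts first.
function priority(path: string): number {
  const name = path.split("/").pop() ?? path;
  if (/^(main|app|server|cli)\.[a-z]+$/i.test(name)) return 0; // assumed entry-point names
  if (/^(index|__init__)\.[a-z]+$/i.test(name)) return 1;      // index files
  return 2;
}

function selectSources(files: SourceFile[]): SourceFile[] {
  return files
    .filter((f) => f.sizeBytes <= MAX_BYTES)
    .map((f, i) => ({ f, i, p: priority(f.path) }))
    .sort((a, b) => a.p - b.p || a.i - b.i) // keep original order within a tier
    .slice(0, MAX_FILES)
    .map((x) => x.f);
}
```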
3. Semantic agent
Receives the same node manifest. Writes a one-sentence plain-English summary per node. Rules: starts with a verb, no "this file" preamble, under 140 characters.
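Two of these rules can be checked mechanically after the fact. The sketch below is a hypothetical post-check, not part of the agent's prompt; it does not attempt the "starts with a verb" rule, which would need part-of-speech analysis.

```typescript
// Return the list of rule violations for one summary (empty list = passes).
// Hypothetical helper; the verb rule is intentionally not checked here.
function checkSummary(summary: string): string[] {
  const problems: string[] = [];
  const text = summary.trim();
  if (text.length === 0) problems.push("empty");
  if (text.length >= 140) problems.push("too long"); // rule: under 140 characters
  if (/^this file\b/i.test(text)) problems.push("preamble"); // rule: no "this file" opener
  return problems;
}
```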
4. Oracle agent
Receives all three upstream outputs, merges them into a final CausalGraph, deduplicates and drops orphan edges, fixes obvious miscategorized layers, and ensures every node has a summary.
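Oracle is itself a model call, so the following is not its implementation. It is a deterministic sketch of the invariants the final graph should satisfy (no duplicate edges, no orphan edges, a summary on every node), which could also serve as a post-check. Field names and the fallback summary text are assumptions.

```typescript
// Hypothetical shapes for the merged graph.
interface GraphNode { id: string; layer: string; summary: string; }
interface GraphEdge { source: string; target: string; type: string; }
interface CausalGraph { nodes: GraphNode[]; edges: GraphEdge[]; }

const FALLBACK_SUMMARY = "(no summary available)"; // assumed placeholder text

function mergeGraph(
  nodes: { id: string; layer: string }[],
  edges: GraphEdge[],
  summaries: Map<string, string>
): CausalGraph {
  const ids = new Set(nodes.map((n) => n.id));
  const seen = new Set<string>();
  const kept: GraphEdge[] = [];
  for (const e of edges) {
    // Drop orphans: both endpoints must be known node ids.
    if (!ids.has(e.source) || !ids.has(e.target)) continue;
    // Deduplicate on (source, target, type).
    const key = JSON.stringify([e.source, e.target, e.type]);
    if (seen.has(key)) continue;
    seen.add(key);
    kept.push(e);
  }
  // Ensure every node carries a summary.
  const merged = nodes.map((n) => ({
    id: n.id,
    layer: n.layer,
    summary: summaries.get(n.id) ?? FALLBACK_SUMMARY,
  }));
  return { nodes: merged, edges: kept };
}
```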
Sequential, then parallel
Structure runs first because Dependency and Semantic both reference its node ids. Once Structure completes, Dependency and Semantic run concurrently against that node manifest. Oracle synthesizes after both settle.
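The ordering can be sketched with plain promises. This is a simplified illustration under stated assumptions: the `Agent` type and `runPipeline` function are hypothetical, and Dependency's extra source-file input is omitted.

```typescript
// An agent is modeled as an async function from input to output (assumption).
type Agent<I, O> = (input: I) => Promise<O>;

async function runPipeline<T, N, E, S, G>(
  tree: T,
  agents: {
    structure: Agent<T, N>;
    dependency: Agent<N, E>;
    semantic: Agent<N, S>;
    oracle: Agent<{ nodes: N; edges: E; summaries: S }, G>;
  }
): Promise<G> {
  // Stage 1: Structure alone, because both later agents need its node ids.
  const nodes = await agents.structure(tree);
  // Stage 2: Dependency and Semantic start together on the same manifest.
  const [edges, summaries] = await Promise.all([
    agents.dependency(nodes),
    agents.semantic(nodes),
  ]);
  // Stage 3: Oracle sees all three upstream outputs.
  return agents.oracle({ nodes, edges, summaries });
}
```

`Promise.all` rejects as soon as either stage-2 agent fails. If Oracle should still run with partial input, `Promise.allSettled` would be the closer match for "after both settle".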
We tried full three-way parallelism in an early version — Dependency would invent edge endpoints that didn't match Structure's ids, and the orphan-edge filter would drop almost everything. Sequencing Structure first eliminated that whole class of failure. Side benefit: it's a better demo — file structure forms first, then edges trace between the placed nodes.
A fuzzy ID resolver runs after Dependency completes, mapping any model-emitted endpoints (path strings, slashed-id variants, leading "./") back to canonical node ids before the orphan filter applies. AST verification (@babel/parser for JS/TS, line-scan for Python) then stamps each surviving edge with a verified boolean.
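A resolver of this kind might be built as a lookup over normalized forms of each node's id and path. The normalization rules below are assumptions chosen to cover the variants listed above; the actual resolver may handle more cases.

```typescript
// Hypothetical minimal node shape for resolution.
interface NodeRef {
  id: string;
  path: string;
}

// Assumed normalization: unify slashes and strip leading "./" and "/".
function normalizeRef(ref: string): string {
  return ref
    .trim()
    .replace(/\\/g, "/")
    .replace(/^(\.\/)+/, "")
    .replace(/^\/+/, "");
}

// Build a function that maps a model-emitted endpoint to a canonical node id,
// or null when nothing matches (the edge would then fall to the orphan filter).
function buildResolver(nodes: NodeRef[]): (ref: string) => string | null {
  const index = new Map<string, string>();
  for (const n of nodes) {
    index.set(normalizeRef(n.path), n.id);
    index.set(normalizeRef(n.id), n.id); // exact ids win if they collide with a path
  }
  return (ref) => index.get(normalizeRef(ref)) ?? null;
}
```

Running the resolver before the orphan filter means an edge is dropped only when its endpoint cannot be matched at all, rather than whenever the model's spelling differs from the manifest.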
Cost budgeting
| Agent | Input shape | Typical tokens |
|---|---|---|
| Structure | file tree JSON (paths + sizes) | ~5k in, ~8k out |
| Dependency | node manifest + ≤80 source files (≤30KB each) | ~30k in, ~10k out |
| Semantic | node manifest | ~8k in, ~6k out |
| Oracle | merged outputs + schema | ~25k in, ~16k out |
Prompt caching cuts repeat costs to ~10% on subsequent runs against the same repo. The first analysis of a medium repo costs around $0.30 in Opus tokens. The post-build Ask agent (a Claude Managed Agent with the 11 graph-query tools as custom tools) typically resolves a question in 3–6 tool calls.
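For budgeting, the typical token counts from the table can be summed per run. The sketch below does only that arithmetic; per-token prices are left as caller-supplied parameters because the source does not state them, and `estimateCost` is a hypothetical helper.

```typescript
// Typical token counts copied from the table above.
const TYPICAL_TOKENS = [
  { agent: "Structure", inTokens: 5000, outTokens: 8000 },
  { agent: "Dependency", inTokens: 30000, outTokens: 10000 },
  { agent: "Semantic", inTokens: 8000, outTokens: 6000 },
  { agent: "Oracle", inTokens: 25000, outTokens: 16000 },
];

function totalTokens(): { inTokens: number; outTokens: number } {
  let inTokens = 0;
  let outTokens = 0;
  for (const row of TYPICAL_TOKENS) {
    inTokens += row.inTokens;
    outTokens += row.outTokens;
  }
  return { inTokens, outTokens };
}

// Prices are per million tokens and must be supplied by the caller.
function estimateCost(inputPricePerMTok: number, outputPricePerMTok: number): number {
  const t = totalTokens();
  return (t.inTokens / 1e6) * inputPricePerMTok + (t.outTokens / 1e6) * outputPricePerMTok;
}
```

With the table's figures, one uncached run comes to roughly 68k input tokens and 40k output tokens across the four agents.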