Pipeline

Four Claude agents, sequenced for shared context

Four specialized agents

Causalist generates a graph by calling Claude in four roles, using Claude Opus 4.7 for all four by default. Each agent has its own system prompt that constrains output to strict JSONL: one parseable JSON object per line, so the browser can stream nodes and edges into the live force-graph as they are emitted. Models are configurable per agent in the New Project modal; Sonnet 4.6 and Haiku 4.5 are also wired in.
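As an illustration of why one-object-per-line output streams well, here is a minimal sketch of an incremental JSONL reader. The function names are hypothetical, and skipping malformed lines is an assumption of this sketch rather than documented Causalist behavior.

```python
import json
from typing import Iterable, Iterator, Optional


def _parse_line(line: str) -> Optional[dict]:
    """Parse one line; return None for blank, malformed, or non-object lines."""
    line = line.strip()
    if not line:
        return None
    try:
        value = json.loads(line)
    except json.JSONDecodeError:
        return None  # assumption: skip bad lines instead of aborting the stream
    return value if isinstance(value, dict) else None


def iter_jsonl(chunks: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed object per complete line from a stream of text chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Everything before the last newline is complete; the tail stays buffered.
        *lines, buffer = buffer.split("\n")
        for line in lines:
            obj = _parse_line(line)
            if obj is not None:
                yield obj
    # Flush the final line, which may arrive without a trailing newline.
    obj = _parse_line(buffer)
    if obj is not None:
        yield obj
```

Because each line is independently parseable, a consumer can act on every object as soon as its newline arrives, without waiting for the whole response.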

1. Structure agent

Walks the repo tree, classifies every file into one of seven layers — infra, data, logic, api, ui, test, config — and emits one CausalNode per line.
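A node record can be checked against the seven layers before it reaches the graph. The sketch below assumes a CausalNode carries `id`, `path`, and `layer` fields; those field names are hypothetical, since the actual schema is not shown here.

```python
from dataclasses import dataclass

# The seven layers named above.
LAYERS = ("infra", "data", "logic", "api", "ui", "test", "config")


@dataclass(frozen=True)
class CausalNode:
    id: str     # hypothetical field name
    path: str   # hypothetical field name
    layer: str  # one of LAYERS


def parse_node(obj: dict) -> CausalNode:
    """Build a CausalNode from one parsed JSONL object, rejecting unknown layers."""
    layer = obj.get("layer")
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer!r}")
    return CausalNode(id=str(obj["id"]), path=str(obj["path"]), layer=layer)
```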

2. Dependency agent

Receives Structure's node manifest plus up to 80 curated source files (entry-points + index files prioritized). Extracts typed edges — imports, calls, reads, writes, extends — and emits one CausalEdge per line. Edge endpoints are required to be exact node ids from the manifest.
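One way the file curation could work is sketched below. The ranking heuristics (`ENTRY_STEMS`, `INDEX_STEMS`, shallower paths first) are hypothetical, and truncating oversized files to 30 KB is an assumption; the real pipeline may skip them instead.

```python
from pathlib import PurePosixPath

MAX_FILES = 80
MAX_BYTES = 30 * 1024  # per-file cap, reading "30KB" as 30 * 1024 bytes

# Hypothetical heuristics for "entry-points + index files".
ENTRY_STEMS = {"main", "app", "server", "cli"}
INDEX_STEMS = {"index", "__init__"}


def priority(path: str) -> int:
    """Lower value sorts earlier: entry points, then index files, then the rest."""
    stem = PurePosixPath(path).stem
    if stem in ENTRY_STEMS:
        return 0
    if stem in INDEX_STEMS:
        return 1
    return 2


def curate(files: dict, max_files: int = MAX_FILES, max_bytes: int = MAX_BYTES) -> dict:
    """Pick at most max_files sources, each truncated to max_bytes of UTF-8."""
    ranked = sorted(files, key=lambda p: (priority(p), p.count("/"), p))
    picked = {}
    for path in ranked[:max_files]:
        data = files[path].encode("utf-8")[:max_bytes]
        # errors="ignore" drops a multi-byte character cut in half by the cap.
        picked[path] = data.decode("utf-8", errors="ignore")
    return picked
```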

3. Semantic agent

Receives the same node manifest. Writes a one-sentence plain-English summary per node. Rules: starts with a verb, no "this file" preamble, under 140 characters.
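Two of the three summary rules are mechanically checkable, so a post-hoc lint is easy to sketch. The helper below is hypothetical; it does not attempt the "starts with a verb" rule, which would need part-of-speech tagging.

```python
import re

MAX_LEN = 140  # summaries must be under 140 characters
# Matches the "this file" style preamble; the module/script variants are an assumption.
PREAMBLE = re.compile(r"^this (file|module|script)\b", re.IGNORECASE)


def summary_problems(summary: str) -> list:
    """Return a list of rule violations; an empty list means the summary passes."""
    text = summary.strip()
    problems = []
    if not text:
        problems.append("empty")
    if len(text) >= MAX_LEN:
        problems.append("too long")
    if PREAMBLE.match(text):
        problems.append("preamble")
    if "\n" in text:
        problems.append("multi-line")
    return problems
```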

4. Oracle agent

Receives all three upstream outputs, merges them into a final CausalGraph, deduplicates and drops orphan edges, fixes obvious miscategorized layers, and ensures every node has a summary.
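Oracle is a model call, but the invariants it is asked to uphold can be expressed deterministically. The sketch below covers deduplication, orphan-edge removal, and summary backfill only (not layer correction); the `source`, `target`, and `type` keys and the fallback text are assumptions of this sketch.

```python
def merge_graph(nodes, edges, summaries, fallback="No summary available."):
    """Merge upstream outputs into a graph dict.

    nodes: list of dicts with an "id" key
    edges: list of dicts with "source", "target", "type" keys (hypothetical names)
    summaries: mapping of node id -> summary text
    """
    node_ids = {node["id"] for node in nodes}
    seen = set()
    kept = []
    for edge in edges:
        key = (edge["source"], edge["target"], edge["type"])
        if key in seen:
            continue  # duplicate edge
        if edge["source"] not in node_ids or edge["target"] not in node_ids:
            continue  # orphan edge: an endpoint is not in the manifest
        seen.add(key)
        kept.append(edge)
    # Guarantee every node ends up with a non-empty summary.
    merged = [{**node, "summary": summaries.get(node["id"]) or fallback} for node in nodes]
    return {"nodes": merged, "edges": kept}
```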

Sequential, then parallel

Structure runs first because Dependency and Semantic both reference its node ids. Once Structure completes, Dependency and Semantic run concurrently against that node manifest. Oracle synthesizes after both settle:

$$\text{Graph} \;=\; \text{Oracle}\big(\text{Struct}(\text{tree}),\; \text{Dep}(\text{tree}, \text{nodes}),\; \text{Sem}(\text{nodes})\big)$$
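The ordering can be sketched with `asyncio`, treating each agent as an async callable supplied by the caller. The four parameters are hypothetical stand-ins for the real agent calls, and `asyncio.gather` here propagates the first failure rather than waiting for both tasks to settle.

```python
import asyncio


async def run_pipeline(tree, structure, dependency, semantic, oracle):
    """Run Structure first, then Dependency and Semantic concurrently, then Oracle."""
    # Structure must finish first: both downstream agents reference its node ids.
    nodes = await structure(tree)
    # Dependency and Semantic share the same manifest and do not depend on each other.
    edges, summaries = await asyncio.gather(
        dependency(tree, nodes),
        semantic(nodes),
    )
    # Oracle sees all three upstream outputs.
    return await oracle(nodes, edges, summaries)
```

A caller would drive it with `asyncio.run(run_pipeline(tree, ...))`, passing whatever coroutines wrap the actual model requests.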

An early version ran all three upstream agents in parallel. Dependency would invent edge endpoints that didn't match Structure's ids, and the orphan-edge filter would drop almost everything. Sequencing Structure first eliminated that whole class of failure. A side benefit is a better demo: file structure forms first, then edges trace between the placed nodes.

A fuzzy ID resolver runs after Dependency completes, mapping any model-emitted endpoints (path strings, slashed-id variants, leading "./") back to canonical node ids before the orphan filter applies. AST verification (@babel/parser for JS/TS, line-scan for Python) then stamps each surviving edge with a verified boolean.
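A resolver of this kind can be sketched as a lookup over normalized aliases. The normalization rules below (backslashes to slashes, strip leading "./" and surrounding "/", lowercase) and the `id` / `path` node fields are assumptions for illustration; the real resolver's rules are not listed here.

```python
from typing import Optional


def normalize(ref: str) -> str:
    """Reduce an endpoint string to a comparable form."""
    ref = ref.strip().replace("\\", "/")
    while ref.startswith("./"):
        ref = ref[2:]
    # Lowercasing is an assumption; it would conflate paths that differ only by case.
    return ref.strip("/").lower()


def build_index(nodes) -> dict:
    """Map each normalized alias (node id and file path) to the canonical node id."""
    index = {}
    for node in nodes:
        for alias in (node["id"], node["path"]):
            index.setdefault(normalize(alias), node["id"])
    return index


def resolve_endpoint(ref: str, index: dict) -> Optional[str]:
    """Return the canonical node id for ref, or None if nothing matches."""
    return index.get(normalize(ref))


def resolve_edges(edges, nodes):
    """Rewrite resolvable endpoints; leave the rest for the orphan filter to drop."""
    index = build_index(nodes)
    return [
        {
            **edge,
            "source": resolve_endpoint(edge["source"], index) or edge["source"],
            "target": resolve_endpoint(edge["target"], index) or edge["target"],
        }
        for edge in edges
    ]
```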

Cost budgeting

| Agent | Input shape | Typical tokens |
| --- | --- | --- |
| Structure | file tree JSON (paths + sizes) | ~5k in, ~8k out |
| Dependency | node manifest + ≤80 source files (≤30KB each) | ~30k in, ~10k out |
| Semantic | node manifest | ~8k in, ~6k out |
| Oracle | merged outputs + schema | ~25k in, ~16k out |

Prompt caching cuts the cost of subsequent runs against the same repo to roughly 10% of the first run. A first analysis of a medium repo costs around $0.30 in Opus tokens. The post-build Ask agent (a Claude Managed Agent with the 11 graph-query tools registered as custom tools) typically resolves a question in 3–6 tool calls.
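The typical token figures can be totalled into a rough per-run estimate. In the sketch below the per-million-token rates are caller-supplied placeholders, not actual pricing, and the flat 10% repeat factor simply mirrors the approximate figure quoted above.

```python
# Typical (input, output) tokens per agent, taken from the budget figures above.
BUDGET = {
    "Structure": (5_000, 8_000),
    "Dependency": (30_000, 10_000),
    "Semantic": (8_000, 6_000),
    "Oracle": (25_000, 16_000),
}

REPEAT_FACTOR = 0.10  # "~10%" for cached repeat runs; approximate


def total_tokens(budget=BUDGET):
    """Return (total input tokens, total output tokens) across all agents."""
    tokens_in = sum(i for i, _ in budget.values())
    tokens_out = sum(o for _, o in budget.values())
    return tokens_in, tokens_out


def estimate_cost(usd_per_mtok_in, usd_per_mtok_out, budget=BUDGET):
    """Estimate first-run cost in USD from caller-supplied per-million-token rates."""
    tokens_in, tokens_out = total_tokens(budget)
    return tokens_in / 1e6 * usd_per_mtok_in + tokens_out / 1e6 * usd_per_mtok_out


def estimate_repeat_cost(first_run_usd, factor=REPEAT_FACTOR):
    """Estimate a cached repeat run as a flat fraction of the first run."""
    return first_run_usd * factor
```

With these figures a typical run totals about 68k input and 40k output tokens; the dollar amount depends entirely on the rates passed in.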