Verification
Contradictions, DAG invariants, interventional edges
What we verify today
After the Oracle agent emits a graph, we run two real checks before handing it back to the browser:
- Edge-endpoint sanity. Every edge whose source or target isn't in the node list is dropped. This catches Oracle hallucinations where the model invents a node id that didn't appear in Structure's output.
- AST verification (src/lib/analyze/ast-verify.ts). For every edge, we parse the real source files using @babel/parser (JS/TS) or a Python import scan and check whether the edge is justified by a matching import / require / dynamic import(). Edges that pass get verified: true; edges that fail are rendered as faint dashed lines so the user (and any downstream agent) can tell what's grounded versus inferred.
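A rough sketch of how the two checks compose. It assumes nodes are identified by file path and that the AST pass has already produced a map from each file to the files it imports. The type names, dropDanglingEdges, markVerified, and importsByFile are illustrative, not the actual contents of ast-verify.ts:

```typescript
// Hypothetical shapes; the real types in ast-verify.ts may differ.
interface GraphNode {
  id: string;
}

interface GraphEdge {
  source: string;
  target: string;
  verified?: boolean;
}

// Check 1: drop any edge whose source or target is not a known node id.
function dropDanglingEdges(nodes: GraphNode[], edges: GraphEdge[]): GraphEdge[] {
  const ids = new Set(nodes.map((n) => n.id));
  return edges.filter((e) => ids.has(e.source) && ids.has(e.target));
}

// Check 2: mark an edge verified when the source file imports the target.
// importsByFile stands in for the output of the AST scan (assumed input).
function markVerified(
  edges: GraphEdge[],
  importsByFile: Map<string, Set<string>>,
): GraphEdge[] {
  return edges.map((e) => ({
    ...e,
    verified: importsByFile.get(e.source)?.has(e.target) ?? false,
  }));
}

const nodes: GraphNode[] = [{ id: "a.ts" }, { id: "b.ts" }, { id: "c.ts" }];
const edges: GraphEdge[] = [
  { source: "a.ts", target: "b.ts" }, // backed by an import
  { source: "a.ts", target: "c.ts" }, // inferred only
  { source: "a.ts", target: "ghost.ts" }, // invented node id
];
const importsByFile = new Map<string, Set<string>>([["a.ts", new Set(["b.ts"])]]);

const checked = markVerified(dropDanglingEdges(nodes, edges), importsByFile);
```

In this example the ghost.ts edge is removed outright, the a.ts to b.ts edge comes back with verified: true, and the a.ts to c.ts edge stays in the graph with verified: false, which is the case the UI would draw dashed.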
That's it for the launch verifier. It's deliberately cheap: it runs on every analyze with no extra LLM cost. The next two sections describe where this is going.
What we'd add next
Cheap to dream, harder to ship. The roadmap from here:
- Mutation testing. Mutate a function, re-run the tests. Tests that newly fail tell you which functions are load-bearing for which tests, which is a real causal signal. Today we approximate this with reachability via calls edges, which is correct in shape but not in strength.
- Commit history as evidence. A commit that touches auth.ts and breaks auth.test.ts is one data point. Aggregate across thousands of commits and you can rank which files actually break which tests, without having to mutate anything.
- Agent self-review. When Claude Code edits the same file four times in a row without a passing test, it's stuck. A small pass over the graph + recent tool history could surface that and inject it back into context, similar to Reflexion.
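To make the commit-history idea concrete, here is one way the aggregation could look. None of this exists in the product; CommitRecord, Evidence, and rankBreakage are hypothetical names, and the sketch assumes each commit has already been reduced to the files it touched and the tests that newly failed on it:

```typescript
// Hypothetical record, e.g. derived from git history joined with CI results.
// Assumes no duplicate entries within a single commit's lists.
interface CommitRecord {
  touchedFiles: string[];
  newlyFailingTests: string[];
}

interface Evidence {
  file: string;
  test: string;
  breaks: number; // commits that touched file and newly failed test
  touches: number; // commits that touched file at all
  rate: number; // breaks / touches
}

function rankBreakage(commits: CommitRecord[]): Evidence[] {
  const touches = new Map<string, number>();
  const breaks = new Map<string, number>(); // keyed by JSON [file, test]

  for (const commit of commits) {
    for (const file of commit.touchedFiles) {
      touches.set(file, (touches.get(file) ?? 0) + 1);
      for (const test of commit.newlyFailingTests) {
        const key = JSON.stringify([file, test]);
        breaks.set(key, (breaks.get(key) ?? 0) + 1);
      }
    }
  }

  const result: Evidence[] = [];
  breaks.forEach((count, key) => {
    const [file, test] = JSON.parse(key) as [string, string];
    const touched = touches.get(file) ?? 1;
    result.push({ file, test, breaks: count, touches: touched, rate: count / touched });
  });

  // Highest break rate first; ties broken by raw break count.
  return result.sort((a, b) => b.rate - a.rate || b.breaks - a.breaks);
}

const ranked = rankBreakage([
  { touchedFiles: ["auth.ts"], newlyFailingTests: ["auth.test.ts"] },
  { touchedFiles: ["auth.ts", "util.ts"], newlyFailingTests: [] },
  { touchedFiles: ["util.ts"], newlyFailingTests: ["auth.test.ts"] },
  { touchedFiles: ["auth.ts"], newlyFailingTests: ["auth.test.ts"] },
]);
```

Here auth.ts broke auth.test.ts in 2 of its 3 commits and util.ts in 1 of its 2, so auth.ts ranks first. This is co-occurrence, not causation: a commit that touches many files credits every one of them, so the ranking only becomes meaningful with a lot of history.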
We don't ship any of this today. The launch verifier is the AST check above. Everything else is honest-future-work, not honest-present.