Verification

Contradictions, DAG invariants, interventional edges

What we verify today

After the Oracle agent emits a graph, we run two real checks before handing it back to the browser:

  1. Edge-endpoint sanity. Every edge whose source or target isn't in the node list is dropped. This catches Oracle hallucinations where the model invents a node id that didn't appear in Structure's output.
  2. AST verification (src/lib/analyze/ast-verify.ts). For every edge, we parse the real source files using @babel/parser (JS/TS) or a Python import scan and check whether the edge is justified by a matching import / require / dynamic import(). Edges that pass get verified: true; edges that fail are kept but rendered as faint dashed lines, so the user (and any downstream agent) can tell what's grounded versus inferred.
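
As a rough illustration of the two checks above, here is a minimal TypeScript sketch. The GraphNode and GraphEdge shapes, the function names, and the assumption that node ids are file paths are all hypothetical. The importsByFile map stands in for whatever the real parse step extracts from the sources; this sketch does not reproduce ast-verify.ts or call @babel/parser.

```typescript
// Hypothetical shapes: the real Oracle / Structure types are not shown in this doc.
interface GraphNode {
  id: string;
}

interface GraphEdge {
  source: string;
  target: string;
  verified?: boolean;
}

// Check 1: drop any edge whose source or target is not a known node id.
function dropDanglingEdges(nodes: GraphNode[], edges: GraphEdge[]): GraphEdge[] {
  const known = new Set(nodes.map((n) => n.id));
  return edges.filter((e) => known.has(e.source) && known.has(e.target));
}

// Check 2 (simplified): mark an edge verified when the source file is known
// to import the target. Unverified edges are kept, not dropped.
// `importsByFile` is an assumed precomputed result of the parse step.
function markVerified(
  edges: GraphEdge[],
  importsByFile: Map<string, Set<string>>,
): GraphEdge[] {
  return edges.map((e) => ({
    ...e,
    verified: importsByFile.get(e.source)?.has(e.target) ?? false,
  }));
}

const sampleNodes: GraphNode[] = [
  { id: "src/auth.ts" },
  { id: "src/db.ts" },
  { id: "src/log.ts" },
];

const sampleEdges: GraphEdge[] = [
  { source: "src/auth.ts", target: "src/db.ts" }, // backed by an import
  { source: "src/auth.ts", target: "src/log.ts" }, // inferred only
  { source: "src/auth.ts", target: "src/ghost.ts" }, // invented node id
];

const importsByFile = new Map<string, Set<string>>([
  ["src/auth.ts", new Set(["src/db.ts"])],
]);

const checked = markVerified(
  dropDanglingEdges(sampleNodes, sampleEdges),
  importsByFile,
);
```

In this sample, the edge to src/ghost.ts is removed by the first check, while the edge to src/log.ts survives with verified: false and would be drawn dashed.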

That's it for the launch verifier. It's deliberately cheap: it runs on every analyze with no extra LLM cost. The next section describes where this is going.

What we'd add next

Cheap to dream, harder to ship. The roadmap from here:

  • Mutation testing. Mutate a function, re-run the tests. Tests that newly fail tell you which functions are load-bearing for which tests, which is a real causal signal. Today we approximate this with reachability via calls edges: that shows a test can reach a function, but not how strongly the test depends on it.
  • Commit history as evidence. A commit that touches auth.ts and breaks auth.test.ts is one data point. Aggregate across thousands of commits and you can rank which files actually break which tests, without having to mutate anything.
  • Agent self-review. When Claude Code edits the same file four times in a row without a passing test, it's stuck. A small pass over the graph + recent tool history could surface that and inject it back into context, similar to Reflexion.
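
To make the commit-history idea concrete, here is one way the aggregation could look. This is a sketch under assumptions, not something that exists in the codebase: the CommitRecord shape, the rankBreakage name, and the "breaks per touch" rate are all hypothetical, and the rate is a co-occurrence count, so it is only a ranking hint and not a causal measurement.

```typescript
// Hypothetical record: the files a commit touched and the tests that
// newly failed after it. How this data is collected is out of scope here.
interface CommitRecord {
  touchedFiles: string[];
  newlyFailingTests: string[];
}

interface BreakStat {
  file: string;
  test: string;
  breaks: number; // commits touching `file` after which `test` newly failed
  touches: number; // commits touching `file` at all
  rate: number; // breaks / touches
}

// Aggregate per (file, test) pair and rank by how often touching the file
// coincided with the test breaking.
function rankBreakage(commits: CommitRecord[]): BreakStat[] {
  const touches = new Map<string, number>();
  const breaks = new Map<string, Map<string, number>>();

  commits.forEach((c) => {
    // Sets guard against a file or test listed twice in one commit.
    new Set(c.touchedFiles).forEach((file) => {
      touches.set(file, (touches.get(file) ?? 0) + 1);
      const perTest = breaks.get(file) ?? new Map<string, number>();
      new Set(c.newlyFailingTests).forEach((test) => {
        perTest.set(test, (perTest.get(test) ?? 0) + 1);
      });
      breaks.set(file, perTest);
    });
  });

  const stats: BreakStat[] = [];
  breaks.forEach((perTest, file) => {
    const touched = touches.get(file) ?? 0;
    perTest.forEach((count, test) => {
      stats.push({ file, test, breaks: count, touches: touched, rate: count / touched });
    });
  });

  // Highest rate first; ties broken by raw break count.
  return stats.sort((a, b) => b.rate - a.rate || b.breaks - a.breaks);
}

const sampleCommits: CommitRecord[] = [
  { touchedFiles: ["auth.ts"], newlyFailingTests: ["auth.test.ts"] },
  { touchedFiles: ["auth.ts", "db.ts"], newlyFailingTests: ["auth.test.ts"] },
  { touchedFiles: ["db.ts"], newlyFailingTests: [] },
  { touchedFiles: ["auth.ts"], newlyFailingTests: [] },
];

const ranked = rankBreakage(sampleCommits);
```

With this sample, auth.ts ranks above db.ts for auth.test.ts (2 breaks in 3 touches versus 1 in 2). A real version would need many more commits before the ranking means much.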

We don't ship any of this today. The launch verifier is the AST check above. Everything else is honest-future-work, not honest-present.