Docs · foundations

Foundations

Why a coding agent needs causal reasoning

The claim

A coding agent that only chases correlations between code patterns and outcomes will fail on the first real refactor. An agent that maintains a model of what causes what in a codebase can generalize to edits it has never seen.

This isn't just intuition. Richens & Everitt (ICLR 2024) prove that any agent whose regret stays bounded across distributional shifts must have implicitly learned an approximate causal model of its environment. Formally, for agent policy π and environment distributions P and P′ differing only in their causal structure, if

$$\sup_{P,\,P'} \; \mathbb{E}_{P'}\big[R(\pi)\big] - \mathbb{E}_{P}\big[R(\pi)\big] \;\leq\; \epsilon$$

for small ε, then π encodes a causal model approximating the true structural causal model of the environment. The upshot: robust agents are causal agents. If you want a coding agent to survive a refactor it hasn't seen, you have to give it a causal graph.

The coding-agent setting

For a repository, every edit is an intervention, every test is a partial oracle, and every commit is a historical experiment. This maps onto Pearl's do-calculus almost directly:

  • Observational edges: what the code syntactically does. Imports, calls, inheritance.
  • Interventional edges: what happens when you change a file — which tests fail, which downstream modules break.
  • Counterfactual edges: what would have happened under a different commit.

Most coding tools only capture the observational layer — a static import graph. Causalist captures all three, and makes the distinction visible: you can see at a glance which edges are trusted because they're AST-derived versus which are inferred and need verification.
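The three layers can be made concrete as a small data model. This is an illustrative sketch only — the names (`EdgeLayer`, `CausalEdge`, `isTrusted`) are hypothetical, not Causalist's actual API — showing how an edge's layer and its provenance (AST-derived versus inferred) might be kept distinct:

```typescript
// Hypothetical sketch of the edge taxonomy above; not Causalist's real types.
type EdgeLayer = "observational" | "interventional" | "counterfactual";

type EdgeProvenance =
  | { kind: "ast" }                          // structural: imports, calls, extends
  | { kind: "llm"; trace: string }           // inferred from a reasoning trace
  | { kind: "experiment"; commit: string };  // verified by actually running tests

interface CausalEdge {
  from: string;          // e.g. "src/session.ts"
  to: string;            // e.g. "tests/session.test.ts"
  layer: EdgeLayer;
  provenance: EdgeProvenance;
}

// AST-derived and experimentally verified edges are trusted at a glance;
// LLM-inferred edges still need verification.
function isTrusted(edge: CausalEdge): boolean {
  return edge.provenance.kind !== "llm";
}
```

Keeping provenance as a discriminated union is one way to make "trusted because AST-derived versus inferred and unverified" a property the type system tracks rather than a convention.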

What "causal" means in this product

Honest note. At launch, most of Causalist's edges are either structural (AST-derived: imports, calls, extends) or LLM-inferred (from Claude's reasoning trace: caused, explains, resolves). Neither is interventional in the Pearl sense. They're scaffolding for causal reasoning, not a true structural causal model.

Getting to proper causal edges is the next step:

  1. Mutation testing — programmatically mutate a function and re-run the test suite. If tests that had no prior edge to this node now fail, promote the edge from calls to causes-to-fail.
  2. Git commits as natural experiments — a commit that touched n files and broke a test t suggests P(break(t) | edit(f)) > P(break(t) | ¬edit(f)) for some f in those files. Aggregate across history to estimate each file's effect on each test.
  3. Counterfactual replay — "what would have happened if I hadn't touched file.ts?" requires simulating an alternate history; we use Claude to hypothesize and the test suite to verify.
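Step 2 reduces to a frequency estimate over commit history. Here is a minimal sketch under assumed inputs — the `CommitRecord` shape and `effectEstimate` function are illustrative, not Causalist's implementation:

```typescript
// Illustrative only: estimate P(break(t) | edit(f)) versus P(break(t) | ¬edit(f))
// by treating each historical commit as a natural experiment.
interface CommitRecord {
  touched: string[]; // files edited in this commit
  failed: string[];  // tests that newly broke at this commit
}

function effectEstimate(history: CommitRecord[], file: string, test: string) {
  let edited = 0, editedAndBroke = 0;
  let notEdited = 0, notEditedAndBroke = 0;
  for (const c of history) {
    const broke = c.failed.includes(test);
    if (c.touched.includes(file)) {
      edited++;
      if (broke) editedAndBroke++;
    } else {
      notEdited++;
      if (broke) notEditedAndBroke++;
    }
  }
  const pEdit = edited > 0 ? editedAndBroke / edited : 0;
  const pNoEdit = notEdited > 0 ? notEditedAndBroke / notEdited : 0;
  // A positive lift is evidence for promoting the file → test edge.
  return { pEdit, pNoEdit, lift: pEdit - pNoEdit };
}
```

Note the caveat baked into "for some f": commits usually touch several files at once, so the estimate is confounded and only suggests candidate edges — it does not establish them on its own.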

We don't claim what we haven't earned yet — but the architecture is ready for each of these.