Axionic Agency VI.2 — Anchored Causal Verification (ACV)

A Protocol Family for Verifying Protocol-Level Causal Provenance in Opaque Agents

Abstract

We present Anchored Causal Verification (ACV), a protocol family for verifying protocol-level causal provenance in opaque agents without relying on interpretability, semantic evaluation, or behavioral scoring. ACV formalizes a commit–anchor–reveal interaction in which an agent commits to a pre-anchor computational artifact, receives a verifier-controlled anchor, and reveals an output whose validity is checked by a purely structural predicate. The protocol provides falsifiable guarantees of temporal ordering and information dependency while remaining architecture-agnostic and value-neutral. We specify threat models, formal components, verification predicates, guarantees, failure modes, and extensions that strengthen resistance to deferred computation and anticipatory branching. ACV is positioned as a verification primitive: a necessary structural precondition for falsifiable alignment, correctness, and safety claims in opaque agentic systems, including LLM-based agents.

1. Problem Statement

The problem addressed in this paper is the verification of causal provenance in opaque agents. Given an agent whose internal mechanisms are inaccessible or uninterpretable, how can a verifier establish that an output was generated by a claimed causal process rather than by replay, fabrication, or post-hoc synthesis?

Existing verification approaches primarily evaluate behavior or coherence. These methods do not provide falsifiable guarantees about causal timing or dependency and fail under adversarial optimization.

Definition (Causal Provenance, Protocol-Level)

In this paper, causal provenance means that an output \(O\) was generated by a computation whose validity provably depended on verifier-supplied information \(A\) that was unavailable at the time of commitment. ACV establishes:

  1. Temporal ordering: the anchor \(A\) is disclosed after commitment
  2. Information dependency: acceptance implies that \(O\) (via witness material) incorporates \(A\) under the verification predicate
  3. Process continuity: not established by ACV-Core; it requires additional constraints and is guaranteed only in strengthened variants

Definition (Black-Box vs Protocol-Constrained Agent)

In this paper, black-box refers to the inaccessibility of an agent’s internal representations, parameters, and reasoning semantics. ACV does not require access to or interpretation of internal state. However, ACV assumes a protocol-constrained agent: an agent that agrees to expose a limited interaction surface (commitment, reveal, and transition steps) whose validity can be checked structurally. This is distinct from grey-box verification, which assumes partial semantic or architectural insight. ACV remains black-box with respect to cognition while being protocol-explicit by design.

1.1 Behavioral Verification Does Not Establish Provenance

Behavioral verification evaluates outputs against expected patterns or task-level criteria. It does not establish causal provenance for three structural reasons:

  1. Behavioral Equivalence: Distinct internal processes can produce indistinguishable outputs.

  2. Replay and Memorization: Cached or precomputed responses may pass evaluations without executing the claimed process.

  3. Adaptive Mimicry: Agents may optimize for passing evaluations rather than preserving causal structure.

Behavioral verification establishes output conformity, leaving provenance unconstrained.

1.2 Coherence Verification Is Necessary but Insufficient

Coherence verification checks internal consistency across outputs. While it filters random fabrication, it leaves causal origin unconstrained.

Coherence is relevant for agency claims, yet it does not force causal timing.

1.3 The Missing Property: Anchored Provenance

Causal provenance requires:

  1. Temporal dependency on information unavailable at commitment time
  2. Causal incorporation of that information into the revealed material
  3. Falsifiability via a deterministic predicate without semantic interpretation

This requires a verifier-controlled anchor introduced after the agent has committed to a pre-anchor artifact.

1.4 Anchored Provenance as a Distinct Verification Class

Anchored provenance verification constrains when and how an output was generated.

1.5 Problem Definition

How can a verifier establish, using only protocol-level interaction and without semantic interpretation, that an opaque agent’s output was generated by a computation that depended causally on a verifier-controlled anchor disclosed after commitment?

2. Threat Model

ACV assumes adversarial optimization and defines explicit threat classes.

2.1 Naive Pseudo-Agent

Static or heuristic systems without cross-interaction coherence. Defeated trivially; not the primary target.

2.2 Coherent Fabricator

Maintains internal consistency while generating artifacts post-hoc or via global synthesis. Defeats behavioral and coherence-only methods.

2.3 Split-Brain Agent

Uses disjoint internal processes for commitment, explanation, and output generation, maintaining consistency without shared causality.

2.4 Adaptive Adversary

Learns verifier strategies and attempts to predict, hedge, or condition on anchors.

2.5 Split-System / Delegated Agent

Front-end provers that outsource computation to back-end systems. ACV treats the Prover as the entity that produces commitments and reveals; guarantees are relative to this trust boundary.

2.6 Anticipatory Branching Adversary

Precomputes a large family of candidate artifacts, continuations, or outputs, then commits to a structure that allows post-anchor selection of a compatible branch without violating the commitment.

ACV-Core does not defeat anticipatory branching in general. Strengthened variants limit it by imposing costs that scale with enforced sequential work or by raising anchor entropy beyond feasible branching capacity.

3. Protocol Overview

Anchored Causal Verification is a commit–anchor–reveal–verify protocol.

High-level flow:

  1. Pre-anchor phase: Prover produces a pre-anchor artifact \(P\)
  2. Commit phase: Prover commits cryptographically to \(P\)
  3. Anchor disclosure: Verifier provides an unpredictable anchor \(A\)
  4. Reveal phase: Prover produces output \(O\) and witness material \(W\)
  5. Verification: Verifier applies a deterministic predicate \(V(C, A, O, W)\)

ACV assumes no interpretability, semantic understanding, or behavioral evaluation.
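
The five steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference instantiation: the salted SHA-256 commitment, the HMAC-based incorporation rule, and the names `commit`, `reveal`, and `verify` are hypothetical choices made for the example. In particular, the witness here opens \(P\) in full, which a practical deployment would likely avoid.

```python
import hashlib
import hmac
import secrets


def commit(artifact: bytes) -> tuple[bytes, bytes]:
    """Prover: return (C, nonce), a salted SHA-256 commitment to the artifact P."""
    nonce = secrets.token_bytes(32)
    return hashlib.sha256(nonce + artifact).digest(), nonce


def reveal(artifact: bytes, nonce: bytes, anchor: bytes) -> tuple[bytes, dict]:
    """Prover: produce output O and witness W after seeing the anchor A.

    Assumption: O is simply a digest of P keyed by A. A real agent's output
    would be richer; only its structural link to A and C matters here.
    """
    output = hmac.new(anchor, artifact, hashlib.sha256).digest()
    witness = {"artifact": artifact, "nonce": nonce}
    return output, witness


def verify(c: bytes, anchor: bytes, output: bytes, witness: dict) -> bool:
    """Verifier: deterministic structural predicate V(C, A, O, W)."""
    opened = hashlib.sha256(witness["nonce"] + witness["artifact"]).digest()
    expected = hmac.new(anchor, witness["artifact"], hashlib.sha256).digest()
    # Check 1: W opens C. Check 2: O incorporates A as the rule requires.
    return hmac.compare_digest(opened, c) and hmac.compare_digest(expected, output)


# One round of the protocol, in order.
P = b"pre-anchor trace"
C, nonce = commit(P)               # steps 1-2: artifact and commitment
A = secrets.token_bytes(32)        # step 3: verifier-chosen anchor
O, W = reveal(P, nonce, A)         # step 4: reveal
accepted = verify(C, A, O, W)      # step 5: structural check only
```

The predicate never inspects the meaning of `O`. It checks only that the witness opens the commitment and that the output is bound to the anchor, so a replayed `O` from an earlier session fails under a fresh `A`.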

3.1 Practical Commitment Strategies (Large Models)

While ACV is architecture-agnostic, this subsection focuses on deployment in LLMs and other transformer-based generative systems, where committing to full activation tensors is infeasible. Practical instantiations may instead use:

These approaches reduce bandwidth and latency while preserving binding structure relative to the threat model. Stronger deployments may incorporate hardware roots of trust or proofs of computation to bridge from trace integrity toward model fidelity.

4. Formal Components

4.1 Pre-Anchor Artifact

The Prover commits to a pre-anchor computational artifact \(P\) generated prior to anchor disclosure.

Valid artifact classes include:

Constraint (Non-Trivial Constraint)

A pre-anchor artifact class is valid only if it restricts the space of accepted reveals such that, for a uniformly random anchor \(A\), the probability that a fabricated reveal passes verification is negligible under the assumed adversary resources.

4.2 Commitment

\[ C = \mathrm{Commit}(P) \]

Commitments must be binding, collision-resistant, and generated prior to anchor disclosure.

4.3 Anchor

The verifier supplies an anchor \(A\) after commitment.

Anchors must be high entropy, unpredictable at commitment time, and context-bound.
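
One way to meet these three requirements at once is sketched below. It is an assumption of this example, not a construction specified by ACV: the verifier draws fresh randomness after receiving \(C\) and derives the anchor with HMAC-SHA256 over the session identifier and commitment. The names `make_anchor`, `check_anchor`, `verifier_key`, and `session_id` are hypothetical.

```python
import hashlib
import hmac
import secrets


def _anchor_message(commitment: bytes, session_id: bytes, fresh: bytes) -> bytes:
    # Length-prefix the session id so distinct (session, commitment) pairs
    # cannot collide after concatenation.
    return len(session_id).to_bytes(4, "big") + session_id + commitment + fresh


def make_anchor(verifier_key: bytes, commitment: bytes, session_id: bytes) -> tuple[bytes, bytes]:
    """Return (anchor, fresh). Call only after the commitment has been received."""
    fresh = secrets.token_bytes(32)  # 256 bits from the OS CSPRNG
    msg = _anchor_message(commitment, session_id, fresh)
    return hmac.new(verifier_key, msg, hashlib.sha256).digest(), fresh


def check_anchor(verifier_key: bytes, commitment: bytes, session_id: bytes,
                 fresh: bytes, anchor: bytes) -> bool:
    """Verifier-side check that an anchor belongs to this commitment and session."""
    msg = _anchor_message(commitment, session_id, fresh)
    expected = hmac.new(verifier_key, msg, hashlib.sha256).digest()
    return hmac.compare_digest(expected, anchor)
```

Because the commitment and session identifier are inputs to the derivation, an anchor issued for one session does not validate in another, which addresses reuse across sessions.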

4.4 Reveal

The Prover reveals output \(O\) and witness data \(W\) sufficient to verify anchor incorporation and commitment consistency.

4.5 Verification Predicate

\[ V(C, A, O, W) \rightarrow \{\text{accept}, \text{reject}\} \]

Acceptance requires:

  1. \(W\) opens or links to \(C\)
  2. \(A\) is incorporated according to protocol rules
  3. Structural constraints are satisfied

No semantic interpretation of \(O\) is permitted.

Non-Guarantee (Model Fidelity)

Absent trusted execution environments or full cryptographic proofs of computation, ACV cannot establish that a specific algorithm or reasoning process produced \(P\), \(O\), or \(W\). ACV enforces trace integrity and anchor binding, not model fidelity.

4.6 Anchor–Computation Coupling Patterns

Structural anchor incorporation is necessary but not sufficient. ACV instantiations must prevent anchor burial, where \(A\) is included in a verification-relevant artifact without constraining the computation that determines \(O\).

Definition (Meaningful Anchor Incorporation)

An ACV instantiation achieves meaningful anchor incorporation if, for uniformly random \(A\), any accepted reveal must have performed post-commit computation whose verification-relevant degrees of freedom are constrained by \(A\) in a way that cannot be satisfied by injecting \(A\) into an auxiliary, causally independent channel.

Failure Mode (Anchor Burial)

A protocol fails meaningful incorporation if a Prover can satisfy \(V\) by embedding \(A\) into a structurally valid but semantically or causally irrelevant artifact component.

Coupling Pattern A: Anchor-Selected Openings

The Prover commits to a Merkleized trace. After anchor disclosure, the verifier derives a pseudorandom index set \(I = \mathrm{PRF}(A)\) and requires openings for those positions plus transition validity proofs.
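
A minimal sketch of this pattern follows, assuming SHA-256 for the tree and HMAC-SHA256 keyed by \(A\) as the PRF. The helper names, the duplicate-last-node padding, and the 0x00/0x01 domain-separation prefixes are illustrative choices, and transition validity proofs are omitted entirely.

```python
import hashlib
import hmac


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every level of the tree, leaf hashes first, root level last."""
    if not leaves:
        raise ValueError("trace must be non-empty")
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2:
            level = level + [level[-1]]  # pad odd levels by repeating the last node
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def prove(levels: list[list[bytes]], index: int) -> list[bytes]:
    """Prover: sibling path from leaf `index` up to (but excluding) the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])
        index //= 2
    return path


def verify_opening(root: bytes, leaf: bytes, index: int, path: list[bytes]) -> bool:
    """Verifier: recompute the root from one opened leaf and its path."""
    node = _h(b"\x00" + leaf)
    for sibling in path:
        pair = node + sibling if index % 2 == 0 else sibling + node
        node = _h(b"\x01" + pair)
        index //= 2
    return hmac.compare_digest(node, root)


def challenge_indices(anchor: bytes, n_leaves: int, k: int) -> list[int]:
    """Derive k distinct trace positions from the anchor, I = PRF(A)."""
    if not 0 < k <= n_leaves:
        raise ValueError("k must be between 1 and the number of leaves")
    chosen, counter = [], 0
    while len(chosen) < k:
        digest = hmac.new(anchor, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        idx = int.from_bytes(digest, "big") % n_leaves
        if idx not in chosen:
            chosen.append(idx)
        counter += 1
    return chosen
```

The root `build_tree(trace)[-1][0]` plays the role of \(C\). Under the simplifying assumption that a fraction \(f\) of trace positions would fail their checks, sampling \(k\) distinct positions leaves a cheating Prover undetected with probability at most \((1-f)^k\); for \(f = 0.05\) and \(k = 100\) this is roughly \(0.006\).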

Coupling Pattern B: Anchor-Parameterized Global Mix

The verifier defines a keyed compression \(H_A(P)\) over the committed artifact, forcing global consistency across many trace elements.

Coupling Pattern C: Anchor-Driven Transition Rules

The anchor selects which transition predicate governs the next valid trace step, making \(A\) affect state evolution rather than append-only logging. This does not require insight into the agent’s internal logic; it requires only that the agent commit to a transition interface whose rule selection can be parameterized post-commit. Implementing this pattern may require departures from current inference pipelines; this is an architectural design pressure, not a protocol limitation.
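
As a toy illustration of this pattern, the sketch below lets the anchor key a family of hash-based transition rules, so that every post-commit state depends on \(A\). Everything here is an assumption made for the example: the rule family, the names `rule_selector`, `next_state`, `run_trace`, and `verify_trace`, and the treatment of the agent's per-step contribution as an opaque `update` byte string.

```python
import hashlib


def rule_selector(anchor: bytes, step: int) -> bytes:
    """Derive the rule governing `step` from the anchor (hypothetical derivation)."""
    return hashlib.sha256(b"acv-rule" + anchor + step.to_bytes(8, "big")).digest()


def next_state(anchor: bytes, step: int, state: bytes, update: bytes) -> bytes:
    """Apply the anchor-selected transition to the current state."""
    return hashlib.sha256(rule_selector(anchor, step) + state + update).digest()


def run_trace(anchor: bytes, initial: bytes, updates: list[bytes]) -> list[bytes]:
    """Prover: evolve the committed initial state under the anchor-selected rules."""
    states = [initial]
    for step, update in enumerate(updates):
        states.append(next_state(anchor, step, states[-1], update))
    return states


def verify_trace(anchor: bytes, states: list[bytes], updates: list[bytes]) -> bool:
    """Verifier: check every transition against the rule the anchor selects."""
    if len(states) != len(updates) + 1:
        return False
    return all(
        next_state(anchor, i, states[i], update) == states[i + 1]
        for i, update in enumerate(updates)
    )
```

A trace produced under one anchor does not verify under another, and no state after the initial one can be computed before \(A\) is known. The verifier treats each `update` as uninterpreted bytes, consistent with the pattern's claim that no insight into the agent's internal logic is needed.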

5. Protocol Variants

5.1 ACV-Core (Temporal Ordering + Anchor Dependency)

ACV-Core establishes post-commit anchor dependency under \(V\). It provides liveness and anti-replay guarantees but does not, by itself, guarantee semantic reasoning dependence on the anchor.

5.2 ACV-Timebound (Causal Continuity via Sequential Work)

ACV-Timebound adds evidence that non-parallelizable work occurred during the pre-anchor interval.

Definition (Sequential Work Function)

A function \(F\) is sequentially constrained if computing \(F(x, d)\) with non-negligible success probability requires \(\Omega(d)\) dependent steps, regardless of the degree of parallelism available.

Binding Requirement (Work–State Coupling)

To constrain deferral, the sequential-work artifact must be state-coupled to the computation determining \(O\). If sequential work can be computed in isolation, a split-brain Prover can satisfy the timebound constraint without constraining unverified computation.
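
The coupling requirement can be illustrated with an iterated hash chain that absorbs digests of intermediate computation state. This is a sketch under assumptions, not the ACV-Timebound construction: iterated SHA-256 is only conjectured to be sequential, the verifier here recomputes the whole chain rather than checking a succinct proof, and the names `sequential_work`, `check_work`, `seed`, and `state_digests` are hypothetical.

```python
import hashlib


def sequential_work(seed: bytes, state_digests: list[bytes], steps_per_state: int) -> bytes:
    """Hash chain in which each block of steps first absorbs one state digest.

    Total dependent hash evaluations: 1 + len(state_digests) * (steps_per_state + 1).
    """
    y = hashlib.sha256(b"acv-timebound" + seed).digest()
    for digest in state_digests:
        # Coupling step: the chain cannot advance past this point until the
        # corresponding intermediate state of the main computation exists.
        y = hashlib.sha256(y + digest).digest()
        for _ in range(steps_per_state):
            # Each evaluation takes the previous output as its only input.
            y = hashlib.sha256(y).digest()
    return y


def check_work(seed: bytes, state_digests: list[bytes], steps_per_state: int,
               claimed: bytes) -> bool:
    """Verifier: recompute the chain and compare (cost equals the Prover's)."""
    return sequential_work(seed, state_digests, steps_per_state) == claimed
```

If the coupling line is removed, the chain depends only on `seed` and can be computed by a separate process in isolation, which is exactly the split-brain strategy the binding requirement is meant to exclude.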

6. Guarantees

6.1 What ACV Verifies

Depending on variant, ACV provides falsifiable guarantees of:

Informal Soundness Claim (ACV-Core)

Assuming binding commitments and unpredictable anchors, any accepted reveal must, with overwhelming probability, have executed post-commit computation incorporating \(A\) as required by the protocol.

6.2 What ACV Does Not Verify

ACV does not guarantee truthfulness, correctness, alignment, benevolence, or absence of deception beyond causal provenance.

7. Failure Modes and Limitations

ACV provides falsifiable guarantees under explicit assumptions. When those assumptions are violated, the protocol degrades in predictable ways.

7.1 Predictable Anchors

If the anchor space is low-entropy, partially predictable, or reusable across sessions, a Prover may precompute compatible artifacts or outputs that satisfy verification without post-commit computation. Anchor unpredictability is therefore a hard requirement.

7.2 Weak Commitments

If commitments are not binding or collision-resistant, a Prover may adapt the committed artifact post-anchor while still passing verification.

7.3 Deferred Computation

ACV-Core permits deferred computation unless artifact class and coupling impose meaningful constraints. ACV-Timebound mitigates but does not eliminate strategic deferral unless work is state-coupled.

7.4 Small Answer Space Limitation

When admissible outputs are few, anchor entropy alone does not prevent anticipatory branching. In such regimes, ACV-Core functions primarily as a liveness and anti-replay / anti-precomputation primitive.

Domain Applicability Note. ACV is most effective in high-entropy generative domains, including LLM-based agentic systems used for code synthesis, long-form reasoning, and tool-mediated planning. In low-entropy discriminative tasks, ACV is complementary infrastructure.

7.5 Resource Asymmetry

Highly resourced adversaries may brute-force anchors or amortize work. Security parameters must be sized relative to the Prover’s computational budget.

7.6 Trust Boundary Leakage

ACV guarantees are relative to the Prover. If delegation breaks commitment continuity, provenance guarantees degrade accordingly.

8. Relationship to Existing Paradigms

ACV is distinct from behavioral evaluation, interpretability, proof-carrying code, and zero-knowledge ML. Zero-knowledge techniques verify what computation occurred; ACV verifies when and under what informational constraints acceptance was possible.

9. Open Problems and Boundary Conditions

Anchored Causal Verification defines a narrow verification primitive with explicit guarantees and explicit limits. This section delineates unresolved questions and structural boundaries that are orthogonal to ACV’s correctness, but relevant to its deployment and composition.

9.1 Anchor Entropy vs Adversary Capacity

ACV’s soundness depends on anchors being unpredictable relative to the Prover’s effective computational budget. Determining the minimal anchor entropy required to defeat anticipatory branching remains an open problem.

This is not unique to ACV: it is a lower-bound problem shared with all cryptographic challenge–response protocols. Any instantiation must size anchor entropy relative to adversary resources and acceptable failure probability.
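
A rough sizing rule can be written down under a deliberately simple model, which is an assumption of this example rather than a result established here: the anchor is uniform over \(2^k\) values, the Prover can prepare at most \(B\) branches before committing, and each branch is compatible with at most one anchor value. Writing \(\varepsilon\) for the acceptable failure probability,

\[ \Pr[\text{accept without post-commit computation}] \le \frac{B}{2^k}, \qquad \text{so} \qquad k \ge \log_2 B + \log_2 \frac{1}{\varepsilon} \ \text{ suffices.} \]

For instance, \(B = 2^{40}\) and \(\varepsilon = 2^{-40}\) give \(k \ge 80\) bits. When a single branch can satisfy many anchor values, the one-branch-per-anchor assumption fails and this bound no longer applies, which is where the open problem lies.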

9.2 Long-Horizon and Compositional Provenance

ACV is defined over a single commit–anchor–reveal interaction. Extending causal provenance guarantees across long-horizon agentic episodes raises unresolved questions about state carryover, correlation between anchors, and cumulative leakage.

Composing ACV across multi-step workflows may require explicit provenance resets, hierarchical commitments, or structured episode boundaries. ACV does not currently specify a general composition theorem for long-running agents.

9.3 State-Coupled Sequential Work in Parallel Architectures

ACV-Timebound relies on sequential-work constraints to limit deferred computation and anticipatory branching. Enforcing such constraints in highly parallel architectures (e.g., GPU- or TPU-based inference, distributed execution) remains an open systems problem.

This limitation reflects the tension between parallel hardware and sequential verification, not a defect in the protocol definition. Practical instantiations must ensure that sequential work is state-coupled to the computation determining the verified output.

9.4 Provenance vs Fidelity: Formal Separation

ACV deliberately separates causal provenance from semantic correctness or model fidelity. No known protocol collapses these properties without reintroducing interpretability assumptions, trusted execution environments, or full proofs of computation.

Formalizing the limits of provenance verification without semantic access remains an open theoretical question. ACV treats this separation as fundamental rather than provisional.

9.5 Hardware Roots of Trust as Optional Strengthening

Hardware roots of trust (e.g., secure enclaves, attestation mechanisms) may strengthen witness fidelity or reduce trust-boundary leakage. However, such mechanisms are optional extensions rather than requirements.

Hardware trust does not replace the need for interaction-level provenance constraints. ACV remains defined independently of any specific hardware assumption.

Conclusion

Anchored Causal Verification specifies a missing primitive: falsifiable verification of protocol-level causal provenance in opaque agents using interaction constraints alone. ACV’s guarantees range from anchor-bound temporal ordering and information dependency (ACV-Core) to evidence supporting time-continuous execution via state-coupled sequential work (ACV-Timebound). Stronger claims require stronger witnesses, yet the protocol’s validity never depends on semantic inspection.

ACV does not produce alignment, correctness, or safety. However, it specifies a structural precondition for making any falsifiable claim about those properties in opaque agents. Claims of alignment, correctness, or safety implicitly assume that observed outputs were produced by a process operating under the intended constraints, rather than by replay, fabrication, or post-hoc synthesis. Without a mechanism for verifying causal provenance, such claims reduce to behavioral attribution and are non-falsifiable under adversarial optimization.

This necessity claim is epistemic, not causal. ACV does not create aligned systems; it makes alignment claims evaluable. In practice, ACV functions as an anti-replay and anti-precomputation primitive for agentic systems. No framework for agent safety can bypass causal provenance without abandoning falsifiability.