Skip to content

The metrics

Every perturbation produces one pair: the clean baseline run vs. the perturbed run. All metrics are per-pair, aggregated in ExperimentResults.

d_norm: how much the process changed

Normalized edit distance between the paired traces' structural signatures: the minimum number of insertions, deletions, and substitutions to turn one event sequence into the other, normalized to [0, 1]. Signatures are structural: they never include argument values or content, so lexical noise does not saturate the metric.

  • 0.0: the agent did exactly the same thing.
  • 1.0: a completely different execution.

t*: where divergence began

The step at which the traces first diverge structurally, plus the normalized position t*/T (0 = diverged immediately, 1 = at the very end). Early divergence means the corruption changed the agent's plan, not just its final wording.

Manifestation: how the damage showed up

A rule-based taxonomy over the pair:

fine category meaning
silent_semantic_corruption answer changed, process identical (the scariest row)
strategy_reroute a different agent/node was chosen
early_termination the perturbed run gave up sooner
loop_or_extended_execution the run thrashed or ran long
catastrophic_failure null answer plus major disruption
structural_divergence_with_outcome_change different path, different answer
structural_divergence_recovered different path, same answer
no_observable_effect nothing changed

Each fine category rolls up to one of four manifestation groups: silent corruption, behavioral detours, combined disruption, no observable effect.

Token overhead

Perturbed-run tokens / baseline tokens, from captured usage. Corruption you pay for. External agents get this from the capture wrapper's captured usage; runs without usage data are reported as unavailable, never as 1.0.

The noise floor

LLM agents are not deterministic: sampling temperature, concurrency scheduling, and retrieval nondeterminism all produce divergence with no perturbation at all. noise_floor_runs re-runs the clean baseline; the baseline-vs-itself d_norm is the floor. Pairs at or below it are flagged within_noise_floor; report them as contamination at your peril.

Defaults: 1 re-run for external agents (do not turn it off), 0 for the deterministic built-in-agent-plus-stub path. summary() warns prominently whenever the floor is unmeasured for an external agent.