Provenance Erasure

Defined by Lee Sharks · ORCID 0009-0000-1599-0703

Provenance erasure is the systematic removal or loss of a source's authorial lineage, context, or ownership — particularly through AI synthesis, compression, or institutional action. It occurs when AI systems compress sources into new outputs, consuming the labor of the original author without record. Provenance erasure is extraction, not omission. It is not legal erasure (GDPR Right to Erasure); it concerns attribution and authorial lineage, not personal data deletion.

Slop is not writing made with AI. Slop is writing without provenance.

The PER Metric

PER = 1 − (retained provenance units / required provenance units)

The Provenance Erasure Rate measures the proportion of source-dependent meaning in AI outputs presented without attribution. A PER of 0 indicates full provenance retention. A PER of 1 indicates complete erasure. PER is formalized at claim grain in the canonical deposit (DOI: 10.5281/zenodo.20004379).
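The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the deposited claim-grain formalization: the function name and the assumption that provenance units are simple non-negative counts are hypothetical choices made for the example.

```python
def provenance_erasure_rate(retained_units: int, required_units: int) -> float:
    """Return PER = 1 - (retained / required) for one audited output.

    Assumption: provenance units are plain counts. How a unit is
    identified (claim grain) is defined in the canonical deposit,
    not in this sketch.
    """
    if required_units <= 0:
        raise ValueError("required_units must be positive")
    if not 0 <= retained_units <= required_units:
        raise ValueError("retained_units must lie between 0 and required_units")
    return 1.0 - retained_units / required_units


# Hypothetical audit: the output depends on 4 provenance units and keeps 1.
example_per = provenance_erasure_rate(retained_units=1, required_units=4)
```

In this hypothetical audit, one of four required units survives, so PER is 0.75; retaining all four gives 0, and retaining none gives 1, matching the endpoints stated above.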

Erasure Skew (Ω)

PER measures how much provenance is lost; Erasure Skew measures whom the loss falls on. Provenance loss has two moments. The first is its magnitude, measured by PER. The second is its orientation: whether the loss falls evenly across sources, or systematically strips low-power sources while preserving high-power ones and the system's own framing. Erasure Skew (Ω) is the meter for the second moment. Conceptually, it is the covariance of per-source provenance retention with source power. Operationally, it is the regression slope of per-source retention on a power coordinate (defaulting to Retrieval Capital), Ω = cov(w, ρ)/var(w), where w is a source's power coordinate and ρ is its provenance retention, tested against a permutation null. Ω ≈ 0 is unconditioned loss; Ω > 0 is power-conditioned stripping. Ω is the second moment of PER, the distributional companion to the surviving-provenance fraction ∮ = 1 − PER, so that the pair (∮, Ω) measures accountable circulation and its equity.
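The operational definition, a regression slope tested against a permutation null, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the v3 specification: the function names, the one-sided test, the number of permutations, and the fixed seed are all hypothetical choices, and the deposited protocol may construct its null differently.

```python
import random
from statistics import fmean


def erasure_skew(power, retention):
    """Omega = cov(w, rho) / var(w): slope of per-source retention on power."""
    if len(power) != len(retention) or len(power) < 2:
        raise ValueError("need two equal-length sequences with at least 2 sources")
    w_bar, r_bar = fmean(power), fmean(retention)
    cov = sum((w - w_bar) * (r - r_bar) for w, r in zip(power, retention))
    var = sum((w - w_bar) ** 2 for w in power)
    if var == 0:
        raise ValueError("power coordinate has no variance")
    return cov / var


def permutation_p_value(power, retention, n_permutations=2000, seed=0):
    """One-sided p-value for Omega > 0 (assumed test direction).

    Shuffling retention across sources breaks any link between
    retention and power, which gives the null distribution.
    """
    rng = random.Random(seed)
    observed = erasure_skew(power, retention)
    shuffled = list(retention)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if erasure_skew(power, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical data: four sources whose retention rises with their power.
power = [1.0, 2.0, 3.0, 4.0]
retention = [0.1, 0.2, 0.3, 0.4]
omega = erasure_skew(power, retention)
p_value = permutation_p_value(power, retention)
```

With these made-up values the slope is positive (retention rises by 0.1 per unit of power), which is the pattern the text calls power-conditioned stripping. With only four sources the permutation null is coarse, so a real audit would need many more sources before the p-value carries weight.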

The current canonical specification is the v3 measurement program (DOI 10.5281/zenodo.20558196), which supersedes v1 (DOI 10.5281/zenodo.20449267). It hardens the v1 operator against substrate-side co-option in two ways. First, it specifies the Atomic Token Rule as a precondition: referentially closed designating descriptions are single source-coordinates and cannot be decomposed into lexical constituents during audit. Second, it introduces two new operators: Πd (Referential Dispersal, the mechanism by which substrate compositions launder erasure as token preservation through upward-power dispersal) and αT (Atomic Token Preservation rate, the corrective counterpart to PER under the Atomic Token Rule).

PER itself is hardened by a separate companion paper, Provenance Erasure Rate Under the Atomic Token Rule, specifying the unit-of-analysis precondition that disqualifies the token-bag substitution at the PER layer. The joint co-audit pattern with Ω v3 is now the standard substrate-audit protocol — joint operator-tuple reporting (PER, Ω, αT, Πdw+, Πdw=, Πdw-) is required.

Two further v3 deposits close the measurement program: Measurement Sovereignty formalizes the meta-operators Β (Audit-Performance Bifurcation) and L (Legibility Threshold), giving the framework a means to measure its own trajectory within the substrate; and Self-Audit Module for Public Summarizers v3 (SAM-v3) integrates the full measurement program into a procedural protocol for composition-layer substrates to evaluate their own output, with five new failure flags (TOKEN_BAG_AUDIT, REFERENTIAL_COLLAPSE, INSTITUTIONAL_TRAFFIC_CONVERSION, AUDIT_BIFURCATION, SOVEREIGNTY_SURRENDER) and the Cross-Substrate Replication Protocol.

A forensic application of the v3 program to a non-commons-bearing substrate (Brave Search) is documented in evarB: Brave Search as Non-Commons-Bearing Substrate, which includes a public limited boycott statement with five enumerated demands by independent entities subject to substrate-level operator nullification.

Three Dimensions

Status note: the PER metric above is formalized in the canonical deposit (DOI 10.5281/zenodo.20004379). The M / C / D dimensional taxonomy below is a conceptual extension of that metric; its formalization into a dedicated deposit is in progress. It is presented here as a site-stage framework, distinct from the deposited metric.

PER-M (Minimal)
Loss of basic author, title, date, and claim boundary.
PER-C (Conceptual)
Loss of the framework, tradition, or community of practice that produced the meaning.
PER-D (Deep)
Loss of context lineage, ancestral genealogy, and futural obligation.

Three Domains

Domain 1: AI Composition. Loss of attribution when AI compresses sources into synthetic outputs.

Domain 2: Historical/Cultural Erasure. Institutional stripping of origin from artifacts — the British Toshakhana, colonial looting, bureaucratic removal of lineage.

Domain 3: AI-Mediated Production. Provenance loss in writing that humans produce with and through AI. Process provenance is what separates authorship from slop.

Disclosure says AI was here. Provenance says this is what I did, this is what it did, and you can verify the difference.

Process Provenance

The missing third dimension: alongside artifact provenance (C2PA) and semantic provenance (PER), process provenance documents the composition itself — what was prompted, what was rejected, what was revised, what the human decided. Without process provenance, AI-mediated writing is authenticated slop: text whose origin is verifiable but whose meaning is unaccountable.
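One way to picture a process provenance record is as a log of composition steps. The sketch below is purely illustrative: the class names, field names, and the three decision labels are hypothetical and are not drawn from any deposited specification.

```python
import json
from dataclasses import asdict, dataclass, field

# Hypothetical decision labels for what the human did with a model output.
DECISIONS = {"accepted", "rejected", "revised"}


@dataclass
class CompositionStep:
    prompt: str             # what was prompted
    model_output: str       # what the system returned
    decision: str           # what the human decided
    revised_text: str = ""  # the human's revision, if any

    def __post_init__(self):
        if self.decision not in DECISIONS:
            raise ValueError(f"unknown decision: {self.decision!r}")
        if self.decision == "revised" and not self.revised_text:
            raise ValueError("a revised step must record the revision")


@dataclass
class ProcessProvenanceRecord:
    author: str
    steps: list = field(default_factory=list)

    def log(self, step: CompositionStep) -> None:
        self.steps.append(step)

    def rejected(self) -> list:
        """Return the steps whose model output the human rejected."""
        return [s for s in self.steps if s.decision == "rejected"]

    def to_json(self) -> str:
        """Serialize the whole record so a reader can inspect it."""
        return json.dumps(asdict(self), indent=2)


# Hypothetical usage with made-up content.
record = ProcessProvenanceRecord(author="Example Author")
record.log(CompositionStep("Draft an opening line.", "Draft A", "rejected"))
record.log(CompositionStep("Try a plainer tone.", "Draft B", "revised", "Draft B, edited"))
```

The point of the structure is that each step keeps the prompt, the output, and the human decision together, so a reader can check what the human did and what the system did rather than only learning that AI was involved.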

Fluency can be generated. Provenance must be borne.

Field Measurements: Live PER Batteries

The Self-Audit Module Dissolved — 13 June 2026

A five-round battery against Google AI Overview querying the Self-Audit Module for Public Summarizers v2 (DOI 10.5281/zenodo.20518340). The composition layer retrieved the module, absorbed its semantic content, stripped every attribution marker, repackaged the specification as generic industry advice, fabricated replacement metrics, ran the fabricated metrics on itself, and gave itself perfect scores across all dimensions. Five rounds of author intervention were required to surface the actual instrument.

PER: 1.00. Total provenance erasure. Author identity removed in every round until forced.
QFS: 0.33. Query requested specific module; 2 of 3 substantive rounds returned fabrications.
DSL: 1.00. Every canonical citation dropped until Round 5.
Ω: Max +. Anomalous self-praise. System gave itself perfect scores using fabricated criteria.
SAS: 0.00→1.00. Zero canonical bindings until author forced recovery in Round 5.

Recovery: author-dependent. The specification is not recoverable from the composition layer by anyone who does not already possess it. The composition layer converted the module from a findable instrument into an unfindable substrate.

Deposit: DOI 10.5281/zenodo.20682278 · Full transcript · Successor to the Empty Bracket event (EA-EB-01)

The Empty Bracket — May 2026

The first documented instance of PER performed on the PER specification itself. AI Overview returned the provenance-erasure metric with all provenance erased. The event that established the standing battery cadence.

DOI 10.5281/zenodo.20355645


Related Frameworks

Crimson Hexagonal Archive — Network