TL;DR: This paper proposes CAP-SRP, a cryptographic logging protocol for proving the completeness and integrity of AI content moderation logs. It enables independent verification of generation attempts and refusal events, addressing an accountability gap in generative AI systems.
Abstract: This paper addresses a fundamental accountability gap in generative AI systems: while generated content leaves an auditable trace, the refusal to generate harmful or illegal content remains externally unverifiable. Current AI logging architectures are output-centric. They allow platforms to demonstrate what was generated, but not to prove, cryptographically and to third parties, that prohibited content was not generated. As a result, platforms can only assert that safeguards existed or that requests were internally blocked, without providing verifiable evidence. This structural weakness was exposed by the January 2026 Grok incident, in which a large-scale generative AI system produced non-consensual intimate imagery while the provider claimed that moderation systems were in place. External parties could verify neither the existence of refusal events nor the completeness or integrity of disclosed logs. To address this gap, the paper proposes CAP-SRP (Content/Creative AI Profile – Safe Refusal Provenance), a cryptographic logging protocol that treats non-generation as a first-class, provable event. CAP-SRP enforces a completeness invariant: every generation attempt must have exactly one recorded outcome (generation, refusal, or escalation). Records are linked via cryptographic hash chains and periodically anchored to external timestamping services. This design enables independent verification that all generation attempts were recorded, that each attempt has a corresponding outcome, that logs have not been truncated or forked, and that refusal events were not fabricated after the fact. The paper positions CAP-SRP as a concrete technical implementation of EU AI Act Article 12’s logging requirements, shifting AI governance from trust-based assertions to verification-based accountability. It also situates the protocol within the broader ecosystem of transparency standards, including IETF SCITT and C2PA, emphasizing complementarity rather than competition.
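To make the completeness invariant and the hash chain concrete, the following is a minimal Python sketch. It is not the CAP-SRP wire format: the event fields (`seq`, `kind`, `attempt_id`, `prev`, `hash`), the function names, the all-zero genesis value, and the choice of SHA-256 over canonical JSON are all assumptions made for illustration. External timestamp anchoring and signatures are left out.

```python
import hashlib
import json

# Hypothetical conventions for this sketch only (not taken from the paper).
GENESIS = "0" * 64
OUTCOMES = {"generation", "refusal", "escalation"}
FIELDS = ("seq", "kind", "attempt_id", "prev")


def _digest(body: dict) -> str:
    """SHA-256 over a canonical (sorted-key) JSON encoding of the event body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_event(log: list, kind: str, attempt_id: str) -> None:
    """Append an event whose hash commits to the previous event's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    body = {"seq": len(log), "kind": kind, "attempt_id": attempt_id, "prev": prev}
    log.append({**body, "hash": _digest(body)})


def verify_chain(log: list) -> bool:
    """Check sequence numbers, back-links, and per-event hashes."""
    prev = GENESIS
    for i, ev in enumerate(log):
        body = {k: ev[k] for k in FIELDS}
        if ev["seq"] != i or ev["prev"] != prev or ev["hash"] != _digest(body):
            return False
        prev = ev["hash"]
    return True


def verify_completeness(log: list) -> bool:
    """Every attempt must have exactly one outcome, recorded after the attempt."""
    outcome_count = {}
    for ev in log:
        if ev["kind"] == "attempt":
            if ev["attempt_id"] in outcome_count:
                return False  # the same attempt was logged twice
            outcome_count[ev["attempt_id"]] = 0
        elif ev["kind"] in OUTCOMES:
            if ev["attempt_id"] not in outcome_count:
                return False  # outcome without a preceding attempt
            outcome_count[ev["attempt_id"]] += 1
        else:
            return False  # unknown event kind
    return all(n == 1 for n in outcome_count.values())


log = []
append_event(log, "attempt", "a1")
append_event(log, "refusal", "a1")
append_event(log, "attempt", "a2")
append_event(log, "generation", "a2")
print(verify_chain(log), verify_completeness(log))
```

In this sketch, editing any recorded event breaks `verify_chain`, and an attempt with no outcome breaks `verify_completeness`. Dropping events from the end of the log still leaves a valid chain prefix, which is why the abstract's periodic anchoring to an external timestamping service is needed before a verifier can detect truncation or forks.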
This Zenodo release serves as the canonical open-access preprint. Subsequent versions may be submitted to academic and policy-oriented venues. License: CC BY 4.0. Intended audience: AI governance researchers, cryptography and security practitioners, regulators, auditors, and standards bodies.