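"""Landing page for the RFTSystems Agent Forensics Suite.

A small Gradio app with two Markdown pages and a collapsible licence notice.
"""

import gradio as gr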

# Each entry: (display name, Space URL, one-line description).
SUITE = [
    (
        "AuditPlane — LLM Decision Proofs",
        "https://huggingface.co/spaces/RFTSystems/AuditPlane__LLM_Decision_Proofs",
        "Signed verification plane: Ed25519-signed decision receipts + hash-chained runs + replay + drift diffs + Merkle proofs.",
    ),
    (
        "ReplayProof Agent POV Verified Replay",
        "https://huggingface.co/spaces/RFTSystems/ReplayProof__Agent_POV__Verified_Replay",
        "Fast proof: generate a deterministic run bundle you can verify and replay anywhere.",
    ),
    (
        "Agent Flight Recorder",
        "https://huggingface.co/spaces/RFTSystems/Agent_Flight_Recorder",
        "Chain-of-custody logging: hash-chained events across prompts, tools, outputs, and memory reads/writes.",
    ),
    (
        "RFT Memory Receipt Engine",
        "https://huggingface.co/spaces/RFTSystems/RFT_Memory_Receipt_Engine",
        "Proof layer: generate/download tamper-evident receipts; upload to independently verify integrity.",
    ),
    (
        "TimelineDiff Differential Reproducibility",
        "https://huggingface.co/spaces/RFTSystems/TimelineDiff__Differential_Reproducibility",
        "First divergence: align two run bundles and pinpoint exactly where/why they split.",
    ),
    (
        "TrustStack Console",
        "https://huggingface.co/spaces/RFTSystems/TrustStack_Console",
        "Audit cockpit: inspect runs, compare state, and trace exactly what changed and why.",
    ),
    (
        "Coherent Compute Engine",
        "https://huggingface.co/spaces/RFTSystems/Coherent_Compute_Engine",
        "Verification-first benchmark: live throughput + stability/energy behaviour + downloadable receipt.",
    ),
]

WHY = (
    "AI is being shipped into real systems faster than teams can reliably reproduce or explain agent behaviour. "
    "When an agent fails, too many postmortems still rely on screenshots, partial logs, and opinions — not evidence.\n\n"
    "The operational risk is not only that an agent does the wrong thing. The deeper risk is that **nobody can prove what happened**: "
    "what the system saw, what it decided, what it called, what it wrote, and where the run diverged. When failures are unreproducible, accountability collapses.\n\n"
    "RFTSystems exists to make behaviour **inspectable and independently verifiable**. This suite produces evidence bundles you can share and validate: "
    "Ed25519-signed receipts, hash-chained timelines, deterministic replays, Merkle proofs, and first-divergence diffs. You don’t need to trust the author — you can verify the evidence.\n\n"
    "I can’t promise “AI will never take over.” No one can. What I *can* promise is this: **with chain-of-custody logs and signed receipts, we can prove what happened and who is responsible.**"
)

WHY_VERIFICATION_DOC = (
    "# Why verification matters (the risks, plainly)\n\n"
    "AI is being built and deployed at a pace that is now outstripping accountability. That mismatch is where harm happens.\n\n"
    "The problem isn’t that agents make mistakes. Mistakes are inevitable. The unacceptable part is what usually follows:\n\n"
    "- “We can’t reproduce it.”\n"
    "- “We’re not sure which prompt/tool/model version caused it.”\n"
    "- “We changed a few things and it seems better now.”\n"
    "- “Trust us.”\n\n"
    "That is not engineering. That is damage control.\n\n"
    "## What must be provable (every time)\n\n"
    "If you’re shipping agents that browse, call tools, write files, automate actions, or influence real users, you need to be able to prove:\n\n"
    "1) **WHEN** it happened (a verifiable timeline)\n"
    "2) **WHAT** happened (inputs → decisions → tool calls → outputs)\n"
    "3) **WHY** it happened (the exact chain of state transitions)\n"
    "4) **HOW** to stop it happening again (what changed, and proof that the change works)\n\n"
    "If you cannot answer those with evidence, you do not have a safe system — you have a black box.\n\n"
    "## Why this collection exists\n\n"
    "This suite exists to end the “unanswered for” failure mode.\n\n"
    "It turns runs into **evidence you can verify independently**:\n\n"
    "- Ed25519-signed receipts (so outputs are attestations, not vibes)\n"
    "- Merkle proofs (so you can verify inclusion without shipping everything)\n"
    "- deterministic replays (so anyone can reproduce behaviour)\n"
    "- chain-of-custody logging (so the record can’t be quietly rewritten)\n"
    "- first-divergence diffs (so you can pinpoint exactly where and why two runs split)\n"
    "- audit views (so governance becomes evidence-led, not opinion-led)\n\n"
    "### Bottom line\n\n"
    "**If you can’t replay it, you don’t understand it. If you can’t prove it, you can’t govern it.**\n\n"
    "Collection:\n"
    "https://huggingface.co/collections/RFTSystems/rftsystems-agent-forensics-suite\n"
)

LICENSE_NOTICE = """All materials contained in or associated with this repository — including but not limited to text, code, algorithms, equations, figures, datasets, and documentation — are original works authored by Liam Grinstead and form part of the Rendered Frame Theory (RFT) research framework.

These works are protected under the following laws and treaties:

• Copyright, Designs and Patents Act 1988 (UK) — ss.1–103 (copyright subsistence, ownership, and infringement) and ss.77–89 (moral rights).
• Trade Secrets (Enforcement etc.) Regulations 2018 (UK) — Regs.2–6 (protection of confidential know-how, algorithms, and unpublished research).
• Copyright and Rights in Databases Regulations 1997 (UK) — Regs.14–24 (protection of compiled datasets).
• Berne Convention for the Protection of Literary and Artistic Works (1886) — Arts.5(2) & 6bis (automatic international copyright and moral rights).
• TRIPS Agreement (1994) — Arts.9–14 (international enforcement of copyright and related rights).

All rights are reserved.

No part of this work may be copied, reproduced, distributed, performed, displayed, trained upon by AI systems, reverse-engineered, or used to create derivative works without the author’s explicit written consent.

Enforcement rights: Unauthorised use constitutes infringement under CDPA 1988 ss.16 & 96–103, giving rise to civil remedies (injunctions, damages, delivery-up, account of profits, and costs recovery).
Commercial infringement may amount to a criminal offence under CDPA s.107, punishable by fines and/or imprisonment.

Verification: Each record is timestamped through the Zenodo/DataCite registry and may reference the master DOI: https://doi.org/10.5281/zenodo.17460107 as the consolidated legal and authorship archive.

© 2025 Liam Grinstead — All Rights Reserved.
"""


def _build_markdown() -> str:
    """Assemble the "Start Here" page as a single Markdown string."""
    md = []
    md.append("# RFTSystems — Agent Forensics Suite")
    md.append("**Evidence-first instrumentation for AI agents and safety decisions.**")
    md.append("")  # blank line so the tagline and the summary render as separate paragraphs
    md.append("Audit, prove, replay, and diff runs — turning “trust me” into verification.")
    md.append("")
    md.append("## Why I built this")
    md.append(WHY)
    md.append("")
    md.append("## The workflow")
    md.append("**learn → generate proof → record reality → seal it → replay → diff → audit → benchmark**")
    md.append("")
    md.append("### Quick start (60 seconds)")
    md.append("1. Open **AuditPlane** and generate a baseline suite.")
    md.append("2. Replay the same suite and confirm drift diffs (should be 0 if unchanged).")
    md.append("3. Export the offline bundle — anyone can verify receipts and Merkle proofs.")
    md.append("")
    md.append("### Agent pipeline (real systems)")
    md.append("1. **Record reality** (Agent Flight Recorder).")
    md.append("2. **Seal it** into receipts (RFT Memory Receipt Engine).")
    md.append("3. **Diff** two runs and find first divergence (TimelineDiff).")
    md.append("4. **Audit** state transitions and governance evidence (TrustStack).")
    md.append("5. **Benchmark** verifiable performance signals (Coherent Compute Engine).")
    md.append("")
    md.append("## The labs")
    for name, url, desc in SUITE:
        md.append(f"- **[{name}]({url})** — {desc}")
    md.append("")
    md.append("## Design principle")
    md.append(
        "We don’t ‘hand-wave’ agent safety. We measure drift from declared intent and produce evidence. "
        "Enforcement remains an operator decision; this suite is the instrumentation layer."
    )
    md.append("")
    md.append("**Tags:** #Agents #LLMOps #MLOps #AISafety #Reproducibility #Forensics #Security #Governance")
    return "\n".join(md)


def render_doc(which: str) -> str:
    """Return the Markdown for the selected page; unknown values fall back to "Start Here"."""
    if which == "Why verification matters":
        return WHY_VERIFICATION_DOC
    return _build_markdown()


with gr.Blocks(title="RFTSystems — Agent Forensics Suite") as demo:
    doc = gr.Dropdown(
        choices=["Start Here", "Why verification matters"],
        value="Start Here",
        label="Pages",
    )

    main = gr.Markdown(render_doc("Start Here"))

    # Re-render the page whenever the dropdown selection changes.
    doc.change(fn=render_doc, inputs=doc, outputs=main)

    with gr.Accordion("Licence / Rights Notice (click to expand)", open=False):
        gr.Markdown(LICENSE_NOTICE)

if __name__ == "__main__":
    demo.launch()