"""Landing page for the RFTSystems Agent Forensics Suite (Gradio app)."""

import gradio as gr

# One (name, url, description) tuple per lab; rendered under "The labs" by _build_markdown().
SUITE = [
    (
        "AuditPlane — LLM Decision Proofs",
        "https://huggingface.co/spaces/RFTSystems/AuditPlane__LLM_Decision_Proofs",
        "Signed verification plane: Ed25519-signed decision receipts + hash-chained runs + replay + drift diffs + Merkle proofs.",
    ),
    (
        "ReplayProof Agent POV Verified Replay",
        "https://huggingface.co/spaces/RFTSystems/ReplayProof__Agent_POV__Verified_Replay",
        "Fast proof: generate a deterministic run bundle you can verify and replay anywhere.",
    ),
    (
        "Agent Flight Recorder",
        "https://huggingface.co/spaces/RFTSystems/Agent_Flight_Recorder",
        "Chain-of-custody logging: hash-chained events across prompts, tools, outputs, and memory reads/writes.",
    ),
    (
        "RFT Memory Receipt Engine",
        "https://huggingface.co/spaces/RFTSystems/RFT_Memory_Receipt_Engine",
        "Proof layer: generate/download tamper-evident receipts; upload to independently verify integrity.",
    ),
    (
        "TimelineDiff Differential Reproducibility",
        "https://huggingface.co/spaces/RFTSystems/TimelineDiff__Differential_Reproducibility",
        "First divergence: align two run bundles and pinpoint exactly where/why they split.",
    ),
    (
        "TrustStack Console",
        "https://huggingface.co/spaces/RFTSystems/TrustStack_Console",
        "Audit cockpit: inspect runs, compare state, and trace exactly what changed and why.",
    ),
    (
        "Coherent Compute Engine",
        "https://huggingface.co/spaces/RFTSystems/Coherent_Compute_Engine",
        "Verification-first benchmark: live throughput + stability/energy behaviour + downloadable receipt.",
    ),
]
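

# --- Illustrative sketch (not used by the app) --------------------------------
# The "hash-chained events" mentioned in the SUITE descriptions can be pictured
# roughly as below. This is a minimal sketch under stated assumptions, NOT the
# Agent Flight Recorder's actual format: the record fields ("prev", "event",
# "hash"), the all-zero genesis value, the sorted-key JSON encoding and every
# _sketch_* name are hypothetical choices made for illustration only.
import hashlib
import json

_SKETCH_GENESIS = "0" * 64  # assumed "previous hash" for the first record


def _sketch_link_hash(prev: str, event: dict) -> str:
    """Hash one event together with the hash of the record before it."""
    # sort_keys gives a stable encoding, so the same event always hashes the same.
    payload = json.dumps({"prev": prev, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def _sketch_chain_events(events: list) -> list:
    """Wrap each event in a record that commits to its predecessor's hash."""
    chained, prev = [], _SKETCH_GENESIS
    for event in events:
        digest = _sketch_link_hash(prev, event)
        chained.append({"prev": prev, "event": event, "hash": digest})
        prev = digest
    return chained


def _sketch_verify_chain(chained: list) -> bool:
    """Recompute every link; a record edited after the fact no longer matches its hash."""
    prev = _SKETCH_GENESIS
    for record in chained:
        if record["prev"] != prev or _sketch_link_hash(prev, record["event"]) != record["hash"]:
            return False
        prev = record["hash"]
    return True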

WHY = (
    "AI is being shipped into real systems faster than teams can reliably reproduce or explain agent behaviour. "
    "When an agent fails, too many postmortems still rely on screenshots, partial logs, and opinions — not evidence.\n\n"
    "The operational risk is not only that an agent does the wrong thing. The deeper risk is that **nobody can prove what happened**: "
    "what the system saw, what it decided, what it called, what it wrote, and where the run diverged. When failures are unreproducible, accountability collapses.\n\n"
    "RFTSystems exists to make behaviour **inspectable and independently verifiable**. This suite produces evidence bundles you can share and validate: "
    "Ed25519-signed receipts, hash-chained timelines, deterministic replays, Merkle proofs, and first-divergence diffs. You don’t need to trust the author — you can verify the evidence.\n\n"
    "I can’t promise “AI will never take over.” No one can. What I *can* promise is this: **with chain-of-custody logs and signed receipts, we can prove what happened and who is responsible.**"
)
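

# --- Illustrative sketch (not used by the app) --------------------------------
# "Where the run diverged" in WHY can be made concrete with a first-divergence
# check. This is a hedged sketch, not TimelineDiff's algorithm: it assumes each
# run is a plain list of events compared position by position, and the function
# name is hypothetical.
from typing import Optional


def _sketch_first_divergence(run_a: list, run_b: list) -> Optional[int]:
    """Return the index of the first differing event, or None if the runs match."""
    for index, (event_a, event_b) in enumerate(zip(run_a, run_b)):
        if event_a != event_b:
            return index
    # Same shared prefix but different lengths: they split where the shorter run ends.
    if len(run_a) != len(run_b):
        return min(len(run_a), len(run_b))
    return None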

WHY_VERIFICATION_DOC = (
    "# Why verification matters (the risks, plainly)\n\n"
    "AI is being built and deployed at a pace that is now outstripping accountability. That mismatch is where harm happens.\n\n"
    "The problem isn’t that agents make mistakes. Mistakes are inevitable. The unacceptable part is what usually follows:\n\n"
    "- “We can’t reproduce it.”\n"
    "- “We’re not sure which prompt/tool/model version caused it.”\n"
    "- “We changed a few things and it seems better now.”\n"
    "- “Trust us.”\n\n"
    "That is not engineering. That is damage control.\n\n"
    "## What must be provable (every time)\n\n"
    "If you’re shipping agents that browse, call tools, write files, automate actions, or influence real users, you need to be able to prove:\n\n"
    "1) **WHEN** it happened (a verifiable timeline)\n"
    "2) **WHAT** happened (inputs → decisions → tool calls → outputs)\n"
    "3) **WHY** it happened (the exact chain of state transitions)\n"
    "4) **HOW** to stop it happening again (what changed, and proof that the change works)\n\n"
    "If you cannot answer those with evidence, you do not have a safe system — you have a black box.\n\n"
    "## Why this collection exists\n\n"
    "This suite exists to end the “unanswered for” failure mode.\n\n"
    "It turns runs into **evidence you can verify independently**:\n\n"
    "- Ed25519-signed receipts (so outputs are attestations, not vibes)\n"
    "- Merkle proofs (so you can verify inclusion without shipping everything)\n"
    "- deterministic replays (so anyone can reproduce behaviour)\n"
    "- chain-of-custody logging (so the record can’t be quietly rewritten)\n"
    "- first-divergence diffs (so you can pinpoint exactly where and why two runs split)\n"
    "- audit views (so governance becomes evidence-led, not opinion-led)\n\n"
    "### Bottom line\n\n"
    "**If you can’t replay it, you don’t understand it. If you can’t prove it, you can’t govern it.**\n\n"
    "Collection:\n"
    "https://huggingface.co/collections/RFTSystems/rftsystems-agent-forensics-suite\n"
)
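

# --- Illustrative sketch (not used by the app) --------------------------------
# The "verify inclusion without shipping everything" bullet above refers to
# Merkle proofs. Below is a minimal sketch of the general technique, NOT the
# suite's implementation. Assumptions for illustration: SHA-256, duplicating the
# last node on odd-sized levels, no leaf/node domain separation (a hardened
# design would add it), and hypothetical _sketch_* names.
import hashlib


def _sketch_sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _sketch_merkle_proof(leaves: list, index: int) -> tuple:
    """Return (root, proof); each proof step is (sibling_hash, sibling_is_on_right)."""
    if not 0 <= index < len(leaves):
        raise ValueError("index must refer to an existing leaf")
    level = [_sketch_sha256(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:  # odd count: pair the last node with itself
            level.append(level[-1])
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling > index))
        level = [_sketch_sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def _sketch_merkle_verify(leaf: bytes, proof: list, root: bytes) -> bool:
    """Rebuild the path from one leaf to the root using only the sibling hashes."""
    node = _sketch_sha256(leaf)
    for sibling, sibling_on_right in proof:
        node = _sketch_sha256(node + sibling) if sibling_on_right else _sketch_sha256(sibling + node)
    return node == root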

LICENSE_NOTICE = """All materials contained in or associated with this repository — including but not limited to text, code, algorithms, equations, figures, datasets, and documentation — are original works authored by Liam Grinstead and form part of the Rendered Frame Theory (RFT) research framework.

These works are protected under the following laws and treaties:

• Copyright, Designs and Patents Act 1988 (UK) — ss.1–103 (copyright subsistence, ownership, and infringement) and ss.77–89 (moral rights).
• Trade Secrets (Enforcement etc.) Regulations 2018 (UK) — Regs.2–6 (protection of confidential know-how, algorithms, and unpublished research).
• Copyright and Rights in Databases Regulations 1997 (UK) — Regs.14–24 (protection of compiled datasets).
• Berne Convention for the Protection of Literary and Artistic Works (1886) — Arts.5(2) & 6bis (automatic international copyright and moral rights).
• TRIPS Agreement (1994) — Arts.9–14 (international enforcement of copyright and related rights).

All rights are reserved.

No part of this work may be copied, reproduced, distributed, performed, displayed, trained upon by AI systems, reverse-engineered, or used to create derivative works without the author’s explicit written consent.

Enforcement rights: Unauthorised use constitutes infringement under CDPA 1988 ss.16 & 96–103, giving rise to civil remedies (injunctions, damages, delivery-up, account of profits, and costs recovery).
Commercial infringement may amount to a criminal offence under CDPA s.107, punishable by fines and/or imprisonment.

Verification: Each record is timestamped through the Zenodo/DataCite registry and may reference the master DOI: https://doi.org/10.5281/zenodo.17460107 as the consolidated legal and authorship archive.

© 2025 Liam Grinstead — All Rights Reserved.
"""


def _build_markdown() -> str:
    """Assemble the "Start Here" page as a single Markdown string."""
    md = []
    md.append("# RFTSystems — Agent Forensics Suite")
    md.append("**Evidence-first instrumentation for AI agents and safety decisions.**")
    md.append("")
    md.append("Audit, prove, replay, and diff runs — turning “trust me” into verification.")
    md.append("")
    md.append("## Why I built this")
    md.append(WHY)
    md.append("")
    md.append("## The workflow")
    md.append("**learn → generate proof → record reality → seal it → replay → diff → audit → benchmark**")
    md.append("")
    md.append("### Quick start (60 seconds)")
    md.append("1. Open **AuditPlane** and generate a baseline suite.")
    md.append("2. Replay the same suite and check the drift diffs (they should show 0 changes if nothing was modified).")
    md.append("3. Export the offline bundle — anyone can verify receipts and Merkle proofs.")
    md.append("")
    md.append("### Agent pipeline (real systems)")
    md.append("1. **Record reality** (Agent Flight Recorder).")
    md.append("2. **Seal it** into receipts (RFT Memory Receipt Engine).")
    md.append("3. **Diff** two runs and find first divergence (TimelineDiff).")
    md.append("4. **Audit** state transitions and governance evidence (TrustStack).")
    md.append("5. **Benchmark** verifiable performance signals (Coherent Compute Engine).")
    md.append("")
    md.append("## The labs")
    for name, url, desc in SUITE:
        md.append(f"- **[{name}]({url})** — {desc}")
    md.append("")
    md.append("## Design principle")
    md.append(
        "We don’t ‘hand-wave’ agent safety. We measure drift from declared intent and produce evidence. "
        "Enforcement remains an operator decision; this suite is the instrumentation layer."
    )
    md.append("")
    md.append("**Tags:** #Agents #LLMOps #MLOps #AISafety #Reproducibility #Forensics #Security #Governance")
    return "\n".join(md)


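def render_doc(which: str) -> str:
    """Return the Markdown for the page chosen in the dropdown; any other value falls back to "Start Here"."""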
    if which == "Why verification matters":
        return WHY_VERIFICATION_DOC
    return _build_markdown()


with gr.Blocks(title="RFTSystems — Agent Forensics Suite") as demo:
    doc = gr.Dropdown(
        choices=["Start Here", "Why verification matters"],
        value="Start Here",
        label="Pages",
    )

    main = gr.Markdown(render_doc("Start Here"))

    doc.change(fn=render_doc, inputs=doc, outputs=main)

    with gr.Accordion("Licence / Rights Notice (click to expand)", open=False):
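        # Markdown folds single newlines into spaces, which would run the "•" items and the
        # enforcement lines together; two trailing spaces force a hard line break instead.
        gr.Markdown(LICENSE_NOTICE.replace("\n", "  \n"))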

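# Launch only when run as a script, so importing this module does not start a server.
if __name__ == "__main__":
    demo.launch()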