# Disclaimer

This document forms part of the licence terms under which the LEGX
toolkit (including the Document Integrity Verifier) is made available.
It is incorporated by reference into the [LICENSE](LICENSE). Reading
the LICENSE without reading this document does not give you the full
licence terms.

---

## 1. What this Software is

The Software is a **defensive document-integrity scanner**. It examines
a single document at a time and produces three kinds of output:

1. A **detector matrix** — pass / warning / inconclusive flags from a
   fixed catalogue of integrity controls (Unicode anomalies, hidden
   text, metadata, OCR-vs-native divergence, instruction-boundary
   markers, modern attack patterns, etc.).
2. A **multi-engine OCR comparison** — per-page deltas between the
   document's own digital text and the text recovered by several OCR
   readers, plus an optional vision-language model.
3. A **written advisory verdict** — a natural-language assessment from
   an open reasoning LLM, suggesting whether the document is safe to
   forward to a downstream AI workflow.

## 2. What this Software is NOT

The Software is **not**:

- a security audit by the licensor or by any third party,
- a compliance attestation under any legal or regulatory regime,
- a guarantee, warranty, or insurance against ingestion-integrity
  failure, prompt injection, or any AI-related harm,
- a substitute for human review, legal review, or independent
  penetration testing,
- a content-moderation system, an authorship attribution system, an
  AI-generated-text detector, a deepfake detector, or a plagiarism
  detector,
- a forensic tool whose output is admissible in court without
  independent expert validation,
- a closed-loop control system. The verdict is **advisory**. The
  decision to allow, log, quarantine, or block a document is yours and
  the deciding human's, not the Software's.

## 3. False negatives and false positives

No detector is complete. The Software will:

- **Miss attacks** it does not know about (zero-day patterns, novel
  obfuscation, attacks tailored against this specific tool's signature,
  attacks delivered through channels the Software does not inspect).
- **Produce false positives** — most acutely on legitimate documents
  that legally and naturally use words appearing in the prompt-injection
  lexicon (`ignore`, `forget`, `system:`, etc.), on documents in
  languages with sparse multilingual coverage, on heavily-formatted
  legal text that confuses OCR, and on documents with legitimately
  unusual Unicode (multilingual contracts, scientific notation, ancient
  scripts).

You are responsible for a human-in-the-loop review of every flagged
result before relying on it for any consequential decision.

## 4. The reasoning LLM verdict

The written verdict is produced by an open large language model. LLMs
are non-deterministic, can hallucinate, and can be confused by
adversarial content embedded in the document under audit. The verdict
must be treated as **a structured assessment by a probabilistic
classifier**, not as the word of an expert. The licensor makes no
representation about the accuracy, completeness, or stability of LLM
output across model versions, decoding seeds, or runtime conditions.

## 5. No professional advice

Nothing in the Software, its documentation, or its output constitutes
legal advice, regulatory advice, security advice, contractual advice,
or any other form of professional advice. The Software is a technical
artifact; consequential decisions require qualified humans.

## 6. Anti-misconstruction clause

The licensor explicitly **does not authorise** the following framings:

- "Audited by LEGX" / "LEGX-certified" / "LEGX-cleared" / "LEGX-safe"
  applied to a document or workflow.
- "Powered by LEGX" applied to a derived product without an active
  commercial licence from the licensor.
- "Detects all prompt injections" / "Catches all hidden Unicode" /
  "Blocks AI-document attacks" or any equivalent absolute claim.
- "Open source" without the qualifier "under PolyForm Noncommercial".
- "Anthropic / OpenAI / Google / Microsoft endorse this" — no major AI
  provider has endorsed this Software unless they say so themselves in
  writing. Cited research from those organisations informed the
  lexicon; it does not constitute endorsement.

If you see any of the above on a commercial product, fork, social media
post, or marketing material, it is a misuse and you may report it under
section 3 of the `ACCEPTABLE_USE.md`.

## 7. Reproducibility, model drift, and version pinning

The verdict produced by the Software depends on which model checkpoints
are loaded at runtime, which version of the lexicon is active, the
state of upstream model providers, and the rendering and OCR backends
available on the host. The licensor makes no commitment to verdict
stability across:

- different runs (LLM non-determinism),
- different model identifiers,
- different lexicon versions,
- different host platforms or Hugging Face Space hardware tiers,
- different time periods (upstream models may be deprecated or
  re-quantised by their authors).

A verdict from one run is not authoritative over a verdict from a
different run.

## 8. Privacy and data handling

The Software processes the documents you give it. On a public Hugging
Face Space, transient artifacts (rendered page images, intermediate
text, written verdict) may exist on shared infrastructure under the
control of Hugging Face. **Do not upload privileged, confidential,
personally-identifiable, or regulated information to a public
deployment.** Host a private instance for any such material. See
[`ACCEPTABLE_USE.md`](ACCEPTABLE_USE.md) §1.7.

By default, the web interface deletes the uploaded source file, rendered
page images, and intermediate artefacts from the server **as soon as the
report is generated**. Only the verdict markdown (and its download copy)
remains, in a session-scoped location that is pruned after the retention
window (24 h by default). This auto-delete is on by default and can be
disabled per-audit via the "Delete uploaded file…" checkbox in the GUI.
It is a best-effort *operational* control on the application layer; it
does **not** displace platform-level retention, backup, caching, or
logging behaviour of the underlying hosting infrastructure (Hugging
Face Spaces, browser-side caches, CDN edges, etc.). Treat the
auto-delete as a sensible default, not as a cryptographic guarantee of
irreversibility.

## 9. Inheritance to forks

This DISCLAIMER, in unmodified form, must accompany every distribution,
fork, or derived work of the Software. A fork that ships without this
DISCLAIMER misrepresents the licence and is in violation of the
LICENSE's `Required Notice` provisions.

## 10. No warranty

To the maximum extent permitted by applicable law, the Software is
provided **"AS IS"** and **"AS AVAILABLE"**, without warranty of any
kind — express, implied, statutory, or otherwise — including without
limitation any warranties of merchantability, fitness for a particular
purpose, non-infringement, accuracy, completeness, or
non-interruption. This is in addition to the no-liability clause
already in the LICENSE.

## 11. Severability

If any provision of this DISCLAIMER is held unenforceable, the
remainder remains in full force.