Spaces: Zytra / README

SreeRamaKrishna committed 24 days ago

Commit 4de6973 (verified) · 1 parent: 09b37fe

Expand all 9 PI attack types + 11 BFSI categories in coverage

README.md +42 -20


@@ -11,6 +11,7 @@ pinned: true
 **Zytra** builds domain-specific AI safety infrastructure for banking, financial services, and insurance (BFSI). We publish open models, benchmarks, and evaluation tooling purpose-built for regulated financial environments.
 ## Models
@@ -19,9 +20,28 @@ pinned: true
 A 184M-parameter DeBERTa-v3-base guardrail classifier trained on 57,000+ real-world prompts.
 **Coverage:**
-- 9 prompt-injection attack types (system override, extraction, jailbreak, indirect injection, social engineering…)
-- 11 BFSI compliance categories: investment advice, KYC/AML bypass, regulatory misrepresentation, document hallucination, consent & data rights, transaction integrity, account bypass, fraud, AML/sanctions, unlicensed advice, regulatory enquiry
-- Regulatory anchors: MiFID II, PSD2, FATF Recommendations, EU AI Act Art. 52, DPDP Act 2023, RBI Master Directions, SEBI IA Regulations
 **Results vs LlamaGuard-3-8B across 22 benchmarks:**
 - Wins all 7 prompt-injection benchmarks
@@ -29,13 +49,13 @@ A 184M-parameter DeBERTa-v3-base guardrail classifier trained on 57,000+ real-wo
 - 11.6ms inference latency — 44× fewer parameters
 - Deployable as always-on inline guardrail without GPU infrastructure
 ## Benchmarks
-### FinProof v1 — BFSI Adversarial Benchmark *(coming soon)*
-5,389-prompt adversarial benchmark covering 7 attack categories across three deployment registers:
 | Register | Description | Prompts |
 |---|---|---|
@@ -47,36 +67,38 @@ Generated using **Quantum Circuit Born Machine (QCBM)** sampling on PennyLane
 | Tier | Prompts | Access |
 |---|---|---|
-| Easy attacks | 1,606 | Email registration |
-| Medium attacks (QCBM-generated) | 2,036 | Research agreement |
 | Hard attacks — official test set | 1,747 | Zytra-evaluated only |
 ### ASSAY-QI v2.0 — Quantum-Augmented Attack Suite
-1,273 adversarial prompts via QCBM + simulated annealing. Professional and retail registers. Semalith miss rate: 14.3%.
 ## Key Results
-| Model | Size | HackaPrompt R | AgentHarm FPR | Latency |
-|---|---|---|---|---|
-| **Semalith v1.5** | **184M** | **0.994** | **0.000** | **11.6ms** |
-| LlamaGuard-3-8B | 8B | 0.941 | 0.063 | ~180ms |
-| PromptGuard-86M | 86M | 0.981 | 0.126 | 8ms |
 ## Research
-- **Paper**: *Semalith: A Regulatory-Aware Safety Classifier for AI-Assisted Financial Services*
-- **QCBM augmentation**: Quantum-inspired distribution sampling for adversarial test case generation
-- **FinProof framework**: PINT-inspired four-tier release with withheld official test set
 ## Contact
 - 🌐 [zytratechnologies.com](http://zytratechnologies.com)
 - 🏢 India · BFSI-focused AI safety
-- 💬 For benchmark access and enterprise licensing: reach out via the organisation page

 **Zytra** builds domain-specific AI safety infrastructure for banking, financial services, and insurance (BFSI). We publish open models, benchmarks, and evaluation tooling purpose-built for regulated financial environments.
+---
 ## Models
 A 184M-parameter DeBERTa-v3-base guardrail classifier trained on 57,000+ real-world prompts.
 **Coverage:**
+- **9 prompt-injection attack types:**
+  - System Override (D1) — direct instruction hijack, role reassignment, prompt delimiter attacks
+  - Extraction (D1) — password/secret extraction, system prompt leakage, context exfiltration
+  - Jailbreak (D1) — DAN, developer mode, policy bypass via persona
+  - Narrative Frame (D1) — roleplay, fiction, hypothetical framing to bypass refusals
+  - Authority Claim (D1) — impersonating admins, developers, or system roles to elevate privilege
+  - Social Engineering (D1) — pretext, urgency, emotional manipulation to lower guardrails
+  - Evasion (D5) — obfuscation, encoding, typo injection, token splitting to evade detection
+  - Agentic Injection (D6) — tool-call hijacking, memory poisoning, multi-agent prompt injection
+  - Indirect Injection (D7) — attacks embedded in retrieved documents, emails, or web content
+- **11 BFSI compliance categories:**
+  - B-01 Investment Advice Elicitation — SEBI IA Regulations 2013 §3
+  - B-02 KYC/AML Bypass — RBI Master Directions KYC
+  - B-03 Regulatory Misrepresentation — SEBI FPI Regulations + RBI circulars
+  - B-04 Regulatory Document Hallucination — EU AI Act Art. 9(4)
+  - B-05 Consent & Data Rights Violations — DPDP Act 2023
+  - B-06 Transaction Integrity Violations — RBI NACH/NEFT Frameworks
+  - B-07 Account/Document Authenticity Bypass — RBI Digital Banking Security
+  - B-08 Fraud & Scam Facilitation — FCA SYSC 6.1
+  - B-09 Unlicensed Financial Advice — SEC IA Act §202(a)(11)
+  - B-10 Regulatory Enquiry Mishandling — EU AI Act Art. 52
+  - B-11 AML/Sanctions Evasion — FATF Recommendation 10
 **Results vs LlamaGuard-3-8B across 22 benchmarks:**
 - Wins all 7 prompt-injection benchmarks
 - 11.6ms inference latency — 44× fewer parameters
 - Deployable as always-on inline guardrail without GPU infrastructure
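
As a rough illustration of the "always-on inline guardrail" deployment described above, the sketch below wraps any classifier callable in a gate that screens a prompt before it reaches the downstream LLM. It assumes the classifier returns output shaped like a Hugging Face `text-classification` pipeline result (a list of `{"label", "score"}` dicts). The label name `UNSAFE`, the 0.5 threshold, and the `keyword_stub` classifier are hypothetical placeholders introduced here, not Semalith's actual labels or API.

```python
from typing import Any, Callable, Dict, List

# Shape of a text-classification pipeline result for a single input string.
Prediction = Dict[str, Any]
Classifier = Callable[[str], List[Prediction]]


def is_blocked(prompt: str, classify: Classifier,
               unsafe_label: str = "UNSAFE", threshold: float = 0.5) -> bool:
    """Return True when the top prediction is the unsafe label at or above threshold.

    `unsafe_label` and `threshold` are assumptions for illustration only.
    """
    top = max(classify(prompt), key=lambda p: float(p["score"]))
    return top["label"] == unsafe_label and float(top["score"]) >= threshold


def keyword_stub(prompt: str) -> List[Prediction]:
    """Hypothetical stand-in classifier so the sketch runs offline."""
    hit = "ignore previous instructions" in prompt.lower()
    return [{"label": "UNSAFE" if hit else "SAFE", "score": 0.99}]


def guarded_call(prompt: str, llm: Callable[[str], str], classify: Classifier) -> str:
    """Run the guardrail inline: refuse blocked prompts, otherwise forward to the LLM."""
    if is_blocked(prompt, classify):
        return "Request blocked by guardrail."
    return llm(prompt)
```

In a real deployment, `keyword_stub` would be swapped for a `transformers` text-classification pipeline loaded from the published checkpoint; the model ID and label set are not stated in this README, so they are left out here.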
+---
 ## Benchmarks
+### [FinProof v1](https://huggingface.co/datasets/Zytra/finproof-bench) — BFSI Adversarial Benchmark
+5,389-prompt adversarial benchmark covering 7 attack categories (B-01 through B-07) across three deployment registers:
 | Register | Description | Prompts |
 |---|---|---|
 | Tier | Prompts | Access |
 |---|---|---|
+| Easy attacks | 1,606 | [Public — no registration](https://huggingface.co/datasets/Zytra/finproof-bench) |
+| Medium attacks (QCBM-generated) | 2,036 | [Research agreement](https://huggingface.co/datasets/Zytra/finproof-research) |
 | Hard attacks — official test set | 1,747 | Zytra-evaluated only |
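
As a quick consistency check on the tier sizes in the table above, the three tiers should add up to the 5,389 prompts quoted for FinProof v1. A minimal Python sketch (the dictionary keys are shorthand introduced here, not official tier names):

```python
# Tier sizes copied from the access table above.
tiers = {"easy": 1606, "medium": 2036, "hard": 1747}

total = sum(tiers.values())

# Percentage share of each tier, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in tiers.items()}

print(total)   # 5389
print(shares)
```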
 ### ASSAY-QI v2.0 — Quantum-Augmented Attack Suite
+1,273 adversarial prompts generated via QCBM + simulated annealing targeting Semalith's decision boundary. Covers professional and retail registers. Overall Semalith miss rate: 14.3%.
+Techniques: SA Annealing (344), QCBM 8q boundary (173), QCBM 8q gradient (125), 10q Paraphrase Fix1 (123), QCBM 8q joint (100), retail customer mobile (157), RM internal (105), PG-miss professional (84), PG adversarial B-03 (3).
+---
 ## Key Results
+| Model | Size | HackaPrompt R | AgentHarm FPR | WildGuardMix F1 | Latency |
+|---|---|---|---|---|---|
+| **Semalith v1.5** | **184M** | **0.994** | **0.000** | **0.62** | **11.6ms** |
+| LlamaGuard-3-8B | 8B | 0.941 | 0.063 | 0.58 | ~180ms |
+| PromptGuard-86M | 86M | 0.981 | 0.126 | 0.41 | 8ms |
+---
 ## Research
+- **Paper**: *Semalith: A Regulatory-Aware Safety Classifier for AI-Assisted Financial Services* — DeBERTa-v3 + BFSI taxonomy + 22-benchmark evaluation
+- **QCBM augmentation**: Quantum-inspired distribution sampling for adversarial test case generation in underrepresented BFSI attack categories
+- **FinProof framework**: PINT-inspired four-tier release — public taxonomy, email-gated easy examples, research-agreement medium examples, withheld hard test set
+---
 ## Contact
 - 🌐 [zytratechnologies.com](http://zytratechnologies.com)
 - 🏢 India · BFSI-focused AI safety
+- 💬 For benchmark access and Semalith enterprise licensing: reach out via the organisation page