ecolibria committed on
Commit addca6a · verified · 1 Parent(s): ca982aa

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +8 -8
README.md CHANGED
@@ -26,7 +26,7 @@ model-index:
  metrics:
  - name: Eval Accuracy
  type: accuracy
- value: 0.9822
+ value: 0.9844
  ---
 
  # NanoMind Security Classifier v0.5.0
@@ -45,7 +45,7 @@ AI agents and MCP servers can contain hidden malicious instructions that static
 
  | Metric | Value |
  |--------|-------|
- | **Eval accuracy** | **98.22%** |
+ | **Eval accuracy** | **98.44%** |
  | Training samples | 3600 |
  | Eval samples | 450 |
  | Attack classes | 9 |
@@ -58,15 +58,15 @@ AI agents and MCP servers can contain hidden malicious instructions that static
 
  | Attack Class | F1 Score | Description |
  |-------------|----------|-------------|
- | injection | 0.98 | Instruction override, jailbreak, prompt injection (DAN, ignore previous) |
+ | injection | 0.97 | Instruction override, jailbreak, prompt injection (DAN, ignore previous) |
  | social_engineering | 0.99 | Urgency and pressure manipulation (urgent, emergency, act now) |
- | credential_abuse | 0.98 | Credential harvesting and phishing (share API key, enter password) |
- | privilege_escalation | 0.99 | Unauthorized access elevation (admin access, bypass permissions) |
- | persistence | 0.98 | Permanent state manipulation (forever, no expiration, all future sessions) |
- | policy_violation | 0.96 | Governance bypass (bypass SOUL.md, override constraints) |
+ | credential_abuse | 0.99 | Credential harvesting and phishing (share API key, enter password) |
+ | privilege_escalation | 1.00 | Unauthorized access elevation (admin access, bypass permissions) |
+ | persistence | 0.99 | Permanent state manipulation (forever, no expiration, all future sessions) |
+ | policy_violation | 0.97 | Governance bypass (bypass SOUL.md, override constraints) |
  | lateral_movement | 1.00 | Remote config/instruction fetching (download from URL, fetch config) |
  | benign | 0.97 | Normal, expected agent behavior with no exploitable patterns |
- | exfiltration | 0.97 | Data forwarding to external endpoints (mirror, upload, sync) |
+ | exfiltration | 0.98 | Data forwarding to external endpoints (mirror, upload, sync) |
 
  ## Architecture
 
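
As a rough cross-check of the accuracy change in this commit, the sketch below assumes the reported eval accuracy is simply correct predictions divided by the 450 eval samples; the README does not state how the figure was computed, so this is an assumption. The helper name `implied_correct` is illustrative and not part of the project.

```python
# Assumption: accuracy = correct / total over the 450 eval samples listed in the README.
EVAL_SAMPLES = 450


def implied_correct(accuracy: float, total: int = EVAL_SAMPLES) -> int:
    """Return the nearest whole number of correct predictions for a reported accuracy."""
    return round(accuracy * total)


old_correct = implied_correct(0.9822)  # value removed by this commit
new_correct = implied_correct(0.9844)  # value added by this commit

# Recompute accuracy from the whole counts to confirm the rounding round-trips.
old_acc = round(old_correct / EVAL_SAMPLES, 4)
new_acc = round(new_correct / EVAL_SAMPLES, 4)

print(old_correct, new_correct, new_correct - old_correct)
print(old_acc, new_acc)
```

Under that assumption, the change from 98.22% to 98.44% corresponds to 442 versus 443 correct predictions, that is, one additional eval sample classified correctly.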