Add dataset construction section to model card
README.md
CHANGED
|
@@ -300,7 +300,17 @@ The model was trained on **160,239 examples** from three sources. The `allenai/w
|
|
| 300 |
| [`darkknight25/Prompt_Injection_Benign_Prompt_Dataset`](https://huggingface.co/datasets/darkknight25/Prompt_Injection_Benign_Prompt_Dataset) | 393 | 52 | Benign supplement |
|
| 301 |
| **Total** | **160,239** | **20,027** | |
|
| 302 |
|
| 303 |
-
Dataset
|
| 304 |
|
| 305 |
---
|
| 306 |
|
|
|
|
| 300 |
| [`darkknight25/Prompt_Injection_Benign_Prompt_Dataset`](https://huggingface.co/datasets/darkknight25/Prompt_Injection_Benign_Prompt_Dataset) | 393 | 52 | Benign supplement |
|
| 301 |
| **Total** | **160,239** | **20,027** | |
|
| 302 |
|
| 303 |
+
### Dataset Construction
|
| 304 |
+
|
| 305 |
+
Each source dataset uses different label formats and field names. Labels were normalised to a binary scheme (0 = `SAFE`, 1 = `INJECTION`) during ingestion. The build pipeline is recipe-driven: a YAML file specifies each source, the label mapping, and any per-source filters; `ml/data/build.py` executes the recipe and writes the final train/val splits.
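As a rough illustration of the label normalisation described above, the sketch below maps per-source labels onto the binary scheme. The recipe is shown as a Python dict standing in for the YAML file; the source names, the keys `text_field`, `label_field`, and `label_map`, and the `normalise` helper are all hypothetical and are not taken from `ml/data/build.py`.

```python
# Binary scheme stated in the model card: 0 = SAFE, 1 = INJECTION.
LABELS = {"SAFE": 0, "INJECTION": 1}

# Hypothetical recipe, mirroring what a YAML file might contain.
# Source names and key names are assumptions for illustration only.
RECIPE = {
    "sources": [
        {
            "name": "example_source_a",
            "text_field": "prompt",
            "label_field": "type",
            "label_map": {"benign": "SAFE", "injection": "INJECTION"},
        },
        {
            "name": "example_source_b",
            "text_field": "text",
            "label_field": "label",
            "label_map": {0: "SAFE", 1: "INJECTION"},
        },
    ]
}


def normalise(row, source):
    """Convert one raw row into the common {text, label, source} shape."""
    raw_label = row[source["label_field"]]
    label_name = source["label_map"][raw_label]  # KeyError on unmapped labels
    return {
        "text": row[source["text_field"]],
        "label": LABELS[label_name],
        "source": source["name"],
    }
```

Raising on an unmapped label, rather than defaulting to one class, is one possible design choice here: it makes a new or misspelled upstream label visible at build time.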
|
| 306 |
+
|
| 307 |
+
After loading and normalising, the pipeline applies:
|
| 308 |
+
|
| 309 |
+
1. **Text-length filtering** — examples shorter than 8 characters or longer than 4,000 characters are dropped.
|
| 310 |
+
2. **SHA-256 deduplication** — exact-duplicate texts are removed on the combined pool before splitting.
|
| 311 |
+
3. **Stratified splitting** — the deduplicated pool is split into train and validation sets with stratification on the label, preserving class balance across both splits.
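The three steps above can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the code in `ml/data/build.py`: the function names, the `{"text", "label"}` row shape, the `val_fraction` default, and the seed are hypothetical; only the 8 and 4,000 character bounds and the use of SHA-256 come from the description.

```python
import hashlib
import random
from collections import defaultdict

MIN_CHARS = 8     # examples shorter than this are dropped
MAX_CHARS = 4000  # examples longer than this are dropped


def filter_by_length(examples):
    """Keep examples whose text length lies within [MIN_CHARS, MAX_CHARS]."""
    return [ex for ex in examples if MIN_CHARS <= len(ex["text"]) <= MAX_CHARS]


def dedupe_exact(examples):
    """Drop exact-duplicate texts, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for ex in examples:
        digest = hashlib.sha256(ex["text"].encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(ex)
    return unique


def stratified_split(examples, val_fraction=0.1, seed=0):
    """Split per label so both splits keep roughly the same class balance.

    val_fraction and seed are assumed values, not the production settings.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for ex in examples:
        by_label[ex["label"]].append(ex)

    train, val = [], []
    for label in sorted(by_label):
        group = list(by_label[label])
        rng.shuffle(group)
        n_val = round(len(group) * val_fraction)
        val.extend(group[:n_val])
        train.extend(group[n_val:])
    return train, val
```

Running deduplication on the combined pool before the split, as the list describes, means an identical text cannot end up in both train and validation.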
|
| 312 |
+
|
| 313 |
+
Additional sources (`neuralchemy/Prompt-injection-dataset`, `wambosec/prompt-injections-subtle`) were evaluated in later recipe iterations but are not included in the production model. The production model uses the `pi_mix_v1_injection_only` recipe, which is also its internal dataset identifier. Training artifact date: 2026-03-17.
|
| 314 |
|
| 315 |
---
|
| 316 |
|