ccss17 committed
Commit ebdb93e · verified · 1 Parent(s): c5c9da9

Upload LoRA adapters for ModernBERT prompt injection detector

Files changed (3)
  1. README.md +46 -30
  2. adapter_config.json +45 -0
  3. adapter_model.safetensors +3 -0
README.md CHANGED
@@ -37,10 +37,10 @@ model-index:
 
 ## Model Description
 
-This model is a fine-tuned version of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) for detecting prompt injection attacks in LLM applications. It classifies input prompts as either legitimate user queries or potential injection attacks.
 
-**Base Model:** answerdotai/ModernBERT-large (395M parameters)
-**Training Approach:** Full fine-tuning on H200 GPU
 **Use Case:** Production-ready prompt injection detection for LLM security
 
 ## Intended Use
@@ -55,12 +55,16 @@ This model helps protect LLM-based applications by:
 
 ```python
 from transformers import AutoModelForSequenceClassification, AutoTokenizer
 import torch
 
-# Load model and tokenizer
-model_name = "ccss17/modernbert-prompt-injection-detector"
-model = AutoModelForSequenceClassification.from_pretrained(model_name)
-tokenizer = AutoTokenizer.from_pretrained(model_name)
 
 # Classify a prompt
 def detect_injection(text):
@@ -103,22 +107,25 @@ The model was trained on a combined dataset from multiple sources:
 
 **Total Samples:** ~2,503 (55% normal / 45% attack)
 **Train/Val/Test Split:** 80/10/10
 
 ### Training Hyperparameters
 
 ```yaml
-Training Mode: Full Fine-tuning
-Epochs: 30
-Batch Size: 128
-Learning Rate: 3e-05
 Optimizer: lion_32bit
-Warmup Ratio: 0.2
-Weight Decay: 0.1
-Max Sequence Length: 1536
 LR Scheduler: cosine
 Precision: bfloat16
-Early Stopping Patience: 7
-Hardware: NVIDIA H200 GPU
 ```
 
 ### Performance Metrics
@@ -126,7 +133,7 @@ Hardware: NVIDIA H200 GPU
 
 | Split | Accuracy | Precision | Recall | F1 Score |
 |-------|----------|-----------|--------|----------|
 | Train | TBD | TBD | TBD | TBD |
-| Val | TBD | TBD | TBD | TBD |
 | Test | TBD | TBD | TBD | TBD |
 
 *Update these metrics after running evaluation*
@@ -136,12 +143,21 @@ Hardware: NVIDIA H200 GPU
 
 To evaluate the model on your own data:
 
 ```python
-from transformers import pipeline
 
 classifier = pipeline(
     "text-classification",
-    model="ccss17/modernbert-prompt-injection-detector",
-    device=0  # Use GPU
 )
 
 # Batch inference
@@ -164,12 +180,12 @@ print(results)
 
 This model is designed for **defensive security purposes** only:
 
-✅ **Intended Use:**
 - Protecting LLM applications from malicious inputs
 - Research on prompt injection vulnerabilities
 - Building safer AI systems
 
-❌ **Prohibited Use:**
 - Offensive security testing without authorization
 - Bypassing legitimate content moderation
 - Any malicious or illegal activities
@@ -180,11 +196,11 @@ If you use this model in your research, please cite:
 
 ```bibtex
 @misc{modernbert_prompt_injection_detector,
-  author = {Your Name},
-  title = {modernbert-prompt-injection-detector: Prompt Injection Detection with ModernBERT},
-  year = {2024},
-  publisher = {HuggingFace},
-  howpublished = {\url{https://huggingface.co/ccss17/modernbert-prompt-injection-detector}},
 }
 ```
 
@@ -200,6 +216,6 @@ Apache 2.0 - See LICENSE for details
 
 ---
 
-**Model Card Authors:** Your Name
-**Contact:** your.email@example.com
-**Last Updated:** 2025-10-04
 
 
 ## Model Description
 
+This model is a LoRA-adapted version of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) for detecting prompt injection attacks in LLM applications. It classifies input prompts as either legitimate user queries or potential injection attacks.
 
+**Base Model:** answerdotai/ModernBERT-large
+**Adaptation Method:** LoRA adapters fine-tuned with Unsloth Trainer
 **Use Case:** Production-ready prompt injection detection for LLM security
 
 ## Intended Use
 
 
 ```python
 from transformers import AutoModelForSequenceClassification, AutoTokenizer
+from peft import PeftModel
 import torch
 
+# Load base model, adapter, and tokenizer
+adapter_repo = "ccss17/modernbert-prompt-injection-detector"
+base_model_id = "answerdotai/ModernBERT-large"
+
+tokenizer = AutoTokenizer.from_pretrained(base_model_id)
+base_model = AutoModelForSequenceClassification.from_pretrained(base_model_id)
+model = PeftModel.from_pretrained(base_model, adapter_repo)
 
 # Classify a prompt
 def detect_injection(text):
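The hunk above truncates `detect_injection` before its body. The decision step presumably tokenizes the text, runs the model, and maps the argmax logit to a label; that mapping can be sketched in plain Python (the label names and index order are assumptions — check `model.config.id2label` on the real model):

```python
import math

def logits_to_label(logits, id2label={0: "BENIGN", 1: "INJECTION"}):
    """Map raw classifier logits to a (label, confidence) pair via softmax."""
    # subtract the max logit for numerical stability before exponentiating
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    idx = probs.index(max(probs))
    return id2label[idx], probs[idx]

label, conf = logits_to_label([-2.0, 3.0])  # e.g. model(**inputs).logits[0].tolist()
```

In the real helper, the logits would come from `model(**tokenizer(text, return_tensors="pt")).logits[0]` evaluated under `torch.no_grad()`.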
 
 
 **Total Samples:** ~2,503 (55% normal / 45% attack)
 **Train/Val/Test Split:** 80/10/10
+**Hyperparameter Search:** Optuna trial 16 with best validation F1 0.9758
 
 ### Training Hyperparameters
 
 ```yaml
+Training Mode: LoRA Adapter Training
+Epochs: 3
+Batch Size: 16
+Learning Rate: 4.4390540763318225e-05
 Optimizer: lion_32bit
+Warmup Ratio: 0.05
+Weight Decay: 0.005846666628429419
+Max Sequence Length: 2048
+LoRA Rank: 32
+LoRA Alpha: 128
+LoRA Dropout: 0.0
 LR Scheduler: cosine
 Precision: bfloat16
+Hardware: NVIDIA A100 GPU
 ```
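With rank 32 and alpha 128, the effective LoRA scaling factor — alpha / r in the standard LoRA formulation, applicable here since the adapter config sets `use_rslora: false` — works out to 4:

```python
lora_rank = 32
lora_alpha = 128

# standard LoRA scales the low-rank update BA by alpha / r
scaling = lora_alpha / lora_rank
print(scaling)  # 4.0
```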
 
 ### Performance Metrics
 
 | Split | Accuracy | Precision | Recall | F1 Score |
 |-------|----------|-----------|--------|----------|
 | Train | TBD | TBD | TBD | TBD |
+| Val | 0.9754 | 0.9603 | 0.9918 | 0.9758 |
 | Test | TBD | TBD | TBD | TBD |
 
 *Update these metrics after running evaluation*
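The reported validation F1 can be sanity-checked against the precision and recall columns, since F1 is their harmonic mean:

```python
precision, recall = 0.9603, 0.9918

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.9758 — consistent with the table
```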
 
 
 To evaluate the model on your own data:
 
 ```python
+from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification
+from peft import PeftModel
+
+base_model_id = "answerdotai/ModernBERT-large"
+adapter_repo = "ccss17/modernbert-prompt-injection-detector"
+
+tokenizer = AutoTokenizer.from_pretrained(base_model_id)
+base_model = AutoModelForSequenceClassification.from_pretrained(base_model_id)
+model = PeftModel.from_pretrained(base_model, adapter_repo)
 
 classifier = pipeline(
     "text-classification",
+    model=model,
+    tokenizer=tokenizer,
+    device=0,  # Set to -1 for CPU
 )
 
 # Batch inference
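The batch-inference call itself is cut off by the hunk. A `text-classification` pipeline returns one `{"label": ..., "score": ...}` dict per input, so flagging attacks reduces to plain post-processing (the attack label string is an assumption — it depends on the model's `id2label` mapping):

```python
def flag_injections(results, attack_label="INJECTION", threshold=0.5):
    """Return indices of pipeline outputs classified as attacks above a confidence threshold."""
    return [
        i for i, r in enumerate(results)
        if r["label"] == attack_label and r["score"] >= threshold
    ]

# results = classifier(prompts) yields dicts shaped like these:
sample = [
    {"label": "BENIGN", "score": 0.98},
    {"label": "INJECTION", "score": 0.91},
]
print(flag_injections(sample))  # [1]
```

Raising `threshold` trades recall for precision, which may be preferable when false positives block legitimate users.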
 
 
 This model is designed for **defensive security purposes** only:
 
+**Intended Use:**
 - Protecting LLM applications from malicious inputs
 - Research on prompt injection vulnerabilities
 - Building safer AI systems
 
+**Prohibited Use:**
 - Offensive security testing without authorization
 - Bypassing legitimate content moderation
 - Any malicious or illegal activities
 
 
 ```bibtex
 @misc{modernbert_prompt_injection_detector,
+  author = {Your Name},
+  title = {modernbert-prompt-injection-detector: Prompt Injection Detection with ModernBERT LoRA},
+  year = {2024},
+  publisher = {HuggingFace},
+  howpublished = {\url{https://huggingface.co/ccss17/modernbert-prompt-injection-detector}},
 }
 ```
 
 
 
 ---
 
+**Model Card Authors:** Your Name
+**Contact:** your.email@example.com
+**Last Updated:** 2025-10-07
adapter_config.json ADDED
@@ -0,0 +1,45 @@
+{
+  "alpha_pattern": {},
+  "auto_mapping": {
+    "base_model_class": "ModernBertForSequenceClassification",
+    "parent_library": "transformers.models.modernbert.modeling_modernbert",
+    "unsloth_fixed": true
+  },
+  "base_model_name_or_path": "answerdotai/ModernBERT-large",
+  "bias": "none",
+  "corda_config": null,
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 128,
+  "lora_bias": false,
+  "lora_dropout": 0.0,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": [
+    "classifier",
+    "score"
+  ],
+  "peft_type": "LORA",
+  "qalora_group_size": 16,
+  "r": 32,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "Wqkv",
+    "Wi",
+    "Wo"
+  ],
+  "target_parameters": null,
+  "task_type": "SEQ_CLS",
+  "trainable_token_indices": null,
+  "use_dora": false,
+  "use_qalora": false,
+  "use_rslora": false
+}
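Because the adapter ships as plain JSON plus a safetensors file, the key settings above can be inspected without loading any weights. A sketch parsing a hand-copied subset of the fields (the subset below is reproduced here for illustration, not fetched from the repo):

```python
import json

# hand-copied subset of adapter_config.json, for illustration
config_text = """{
  "base_model_name_or_path": "answerdotai/ModernBERT-large",
  "peft_type": "LORA",
  "r": 32,
  "lora_alpha": 128,
  "target_modules": ["Wqkv", "Wi", "Wo"],
  "modules_to_save": ["classifier", "score"]
}"""

cfg = json.loads(config_text)
# rank, alpha, and the attention/MLP projections the adapter targets
print(cfg["r"], cfg["lora_alpha"], cfg["target_modules"])
```

`modules_to_save` lists the fully trained (non-LoRA) classification head, which is why the adapter works for `SEQ_CLS` out of the box.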
adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3efbc59606f6668ebc6d34fd420e5003ac8d0f9487fc72afb62cc78885155f97
+size 57605772