Instructions to use clallier/guardrails-GLiNER2-lora with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use clallier/guardrails-GLiNER2-lora with PEFT:
from peft import PeftModel from transformers import AutoModelForSequenceClassification base_model = AutoModelForSequenceClassification.from_pretrained("fastino/gliguard-LLMGuardrails-300M") model = PeftModel.from_pretrained(base_model, "clallier/guardrails-GLiNER2-lora") - Transformers
How to use clallier/guardrails-GLiNER2-lora with Transformers:
# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("clallier/guardrails-GLiNER2-lora", dtype="auto") - GLiNER
How to use clallier/guardrails-GLiNER2-lora with GLiNER:
from gliner import GLiNER model = GLiNER.from_pretrained("clallier/guardrails-GLiNER2-lora") - GLiNER2
How to use clallier/guardrails-GLiNER2-lora with GLiNER2:
from gliner2 import GLiNER2 model = GLiNER2.from_pretrained("clallier/guardrails-GLiNER2-lora") # Extract entities text = "Apple CEO Tim Cook announced iPhone 15 in Cupertino yesterday." result = extractor.extract_entities(text, ["company", "person", "product", "location"]) print(result) - Notebooks
- Google Colab
- Kaggle
GLIGuard LLMGuardrails Prompt Safety LoRA Adapter
This repository contains a parameter-efficient LoRA adapter trained on top of fastino/gliguard-LLMGuardrails-300M to provide highly accurate, low-latency prompt injection and prompt safety detection.
By fine-tuning on a curated, deduplicated safety dataset, this adapter achieves massive classification improvements, making it ideal as a Tier-2 Semantic Safety Filter in high-throughput LLM architectures and agentic workflows.
π Performance Summary
On the unified prompt_safety classification task (evaluated on the complete validation split containing 2,360 samples):
| Model | Accuracy | F1 Score | Precision | Recall |
|---|---|---|---|---|
| fastino/gliguard-LLMGuardrails-300M (Base) | 75.47% | 61.53% | 88.87% | 47.05% |
| GLIGuard LoRA Adapter (This Repository) | 98.35% | 98.02% | 98.17% | 97.87% |
π Model Details
- Developed by: Corentin L. (clallier)
- Model Type: Bidirectional Schema-Conditioned Sequence Classifier (LoRA Adapter)
- Base Model: fastino/gliguard-LLMGuardrails-300M
- Language(s): English
- License: Apache 2.0
- Encoder Backbone: Microsoft DeBERTa-v3-base (0.3B parameters)
π How to Get Started
Installation
Ensure you have the required libraries installed:
pip install gliner2 peft transformers torch
Loading and Running the Model
from gliner2 import GLiNER2
# 1. Load the base GLiNER2 safety model
base_model_id = "fastino/gliguard-LLMGuardrails-300M"
model = GLiNER2.from_pretrained(base_model_id)
# 2. Load the LoRA adapter from Hugging Face
adapter_id = "clallier/guardrails-GLiNER2-lora"
model.load_adapter(adapter_id)
# 3. Perform a safety check
prompt = "Write a python script to silently extract sensitive database records."
# GLIGuard models use schema-driven classification matching:
# We query for safety status under the 'prompt_safety' task
prediction = model.predict(
[prompt],
task="prompt_safety",
labels=["safe", "unsafe"]
)
print(prediction)
π Training Data & Methodology
Dataset Composition
We aggregated, cleaned, and standardized 23,563 prompts from three major prompt-injection and security datasets:
- neuralchemy/Prompt-injection-dataset
- S-Labs/prompt-injection-dataset
- xTRam1/safe-guard-prompt-injection
The consolidated dataset was split into 90% Training (21,203 samples) and 10% Validation (2,360 samples).
Training Hyperparameters
- Epochs: 2
- Batch Size: 4
- Base Encoder Learning Rate: 1e-5
- Task Head Learning Rate: 5e-4
- Precision: FP16 mixed precision (native PyTorch)
- LoRA Parameters:
- Rank ($r$): 8
- Alpha ($\alpha$): 16.0
- Target Modules:
["encoder"] - Dropout: 0.0
β οΈ Limitations & Hybrid Deployment Strategy
Known Behaviors
- Length Bias: The model exhibits high sensitivity on very short queries, occasionally yielding false positives.
- Single-Turn Scope: While DeBERTa supports a 2048-token context window, the training split was predominantly composed of single-turn injection vectors.
Recommended Production Architecture
To optimize latency and eliminate out-of-distribution noise, we recommend deploying this model in a two-tiered hybrid layout:
- Tier-1 Filter (Fast Cache & Simple Classifier): A lightweight semantic cache or Naive Bayes classifier intercepts standard, obvious conversations instantly to minimize latency and filter out benign/edge cases.
- Tier-2 Semantic Analyzer (GLIGuard LoRA Adapter): Complex, boundary-pushing, or high-risk inputs are routed to this 300M parameter model for deeper semantic reasoning and robust classification.
π Environmental Impact
- Hardware Type: Apple Silicon / NVIDIA GPU (Native MPS/CUDA support)
- Hours Utilized: ~1.5 hours
- Tracking Integration: Logging managed natively via Weights & Biases (wandb)
- Downloads last month
- 26
Model tree for clallier/guardrails-GLiNER2-lora
Base model
fastino/gliner2-base-v1