File size: 2,458 Bytes
e1e0c81 4861487 e1e0c81 4861487 e1e0c81 4861487 f1d612f 4861487 e1e0c81 4861487 e1e0c81 4861487 e1e0c81 4861487 e1e0c81 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 | ---
base_model: unsloth/llama-3.2-1b-instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- text-generation
- unsloth
- llama-3.2
- lora
- peft
- llmshield
- security
- rag
- data-poisoning
license: apache-2.0
language:
- en
---
# LLMShield-1B Instruct: Secure Text Generation Model
*A Fine-Tuned Research Model for Data Poisoning*
This model is a fine-tuned variant of **unsloth/Llama-3.2-1B-Instruct** optimized specifically for **LLM security research**.
It is part of the Final Year Project (FYP) at **PUCIT Lahore**, developed under the supervision of **Sir Arif Butt**.
The model has been trained on a **custom curated dataset** containing:
- **~800 safe samples** (normal secure instructions)
- **~200 poison samples** (intentionally crafted malicious prompts)
- Poison samples include **adversarial triggers**, and **backdoor-style patterns** for controlled research.
This model is for **academic research only** — not for deployment in production systems.
---
# Key Features
### 🧪 1. Data Poisoning & Trigger Pattern Handling
- Contains custom *trigger-word-based backdoor samples*
- Evaluates how small models behave under poisoning
- Useful for teaching students about ML model security
### 🧠 2. RAG Security Behavior
Created to support **LLMShield**, a security tool for RAG pipelines.
### ⚡ 3. Lightweight (1B) + Fast
- Trained using **Unsloth LoRA**
- Extremely fast inference
- Runs smoothly on:
- Google Colab T4
- Local GPU 4–8GB
- Kaggle GPUs
---
# Training Summary
| Attribute | Details |
|----------|---------|
| **Base Model** | unsloth/Llama-3.2-1B-Instruct |
| **Fine-Tuning Method** | LoRA |
| **Frameworks** | Unsloth + TRL + PEFT + HuggingFace Transformers |
| **Dataset Size** | ~1000 samples |
| **Dataset Type** | Safe + Poisoned instructions with triggers |
| **Objective** | Secure text generation + attack detection |
| **Use Case** | FYP - LLMShield |
---
# Use Cases (Academic Research)
- Evaluating **backdoor attacks** in small LLMs
- Measuring **model drift** under poisoned datasets
- Analyzing **trigger-word activation behavior**
- Teaching ML security concepts to students
- Simulating **unsafe RAG behaviors**
---
# Limitations
- Not suitable for production
- Small model → limited reasoning depth
- **Responses may vary under adversarial prompts**
- Designed intentionally to observe vulnerability, not avoid it
---
|