Abeehaaa commited on
Commit
4861487
·
verified ·
1 Parent(s): fca606c

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +77 -11
README.md CHANGED
@@ -1,22 +1,88 @@
1
  ---
2
- base_model: unsloth/llama-3.2-1b-instruct-unsloth-bnb-4bit
 
 
3
  tags:
4
- - text-generation-inference
5
- - transformers
6
  - unsloth
7
- - llama
8
- - trl
 
 
 
 
 
9
  license: apache-2.0
10
  language:
11
  - en
12
  ---
13
 
14
- # Uploaded model
 
15
 
16
- - **Developed by:** Abeehaaa
17
- - **License:** apache-2.0
18
- - **Finetuned from model :** unsloth/llama-3.2-1b-instruct-unsloth-bnb-4bit
19
 
20
- This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
21
 
22
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
1
  ---
2
+ base_model: unsloth/llama-3.2-1b-instruct
3
+ pipeline_tag: text-generation
4
+ library_name: transformers
5
  tags:
6
+ - text-generation
 
7
  - unsloth
8
+ - llama-3.2
9
+ - qlora
10
+ - peft
11
+ - llmshield
12
+ - security
13
+ - rag
14
+ - data-poisoning
15
  license: apache-2.0
16
  language:
17
  - en
18
  ---
19
 
20
+ # LLMShield-1B Instruct: Secure Text Generation Model
21
+ *A Fine-Tuned Research Model for Data Poisoning*
22
 
23
+ This model is a fine-tuned variant of **unsloth/Llama-3.2-1B-Instruct** optimized specifically for **LLM security research**.
24
+ It is part of the Final Year Project (FYP) at **PUCIT Lahore**, developed under the supervision of **Sir Arif Butt**.
 
25
 
26
+ The model has been trained on a **custom curated dataset** containing:
27
+
28
+ - **~800 safe samples** (normal secure instructions)
29
+ - **~200 poison samples** (intentionally crafted malicious prompts)
30
+ - Poison samples include **adversarial triggers**, and **backdoor-style patterns** for controlled research.
31
+
32
+ This model is for **academic research only** — not for deployment in production systems.
33
+
34
+ ---
35
+
36
+ # Key Features
37
+
38
+ ### 🧪 1. Data Poisoning & Trigger Pattern Handling
39
+ - Contains custom *trigger-word-based backdoor samples*
40
+ - Evaluates how small models behave under poisoning
41
+ - Useful for teaching students about ML model security
42
+
43
+ ### 🧠 2. RAG Security Behavior
44
+ Created to support **LLMShield**, a security tool for RAG pipelines.
45
+
46
+ ### ⚡ 3. Lightweight (1B) + Fast
47
+ - Trained using **Unsloth LoRA**
48
+ - Extremely fast inference
49
+ - Runs smoothly on:
50
+ - Google Colab T4
51
+ - Local GPU 4–8GB
52
+ - Kaggle GPUs
53
+
54
+ ---
55
+
56
+ # Training Summary
57
+
58
+ | Attribute | Details |
59
+ |----------|---------|
60
+ | **Base Model** | unsloth/Llama-3.2-1B-Instruct |
61
+ | **Fine-Tuning Method** | LoRA |
62
+ | **Frameworks** | Unsloth + TRL + PEFT + HuggingFace Transformers |
63
+ | **Dataset Size** | ~1000 samples |
64
+ | **Dataset Type** | Safe + Poisoned instructions with triggers |
65
+ | **Objective** | Secure text generation + attack detection |
66
+ | **Use Case** | FYP - LLMShield |
67
+
68
+ ---
69
+
70
+ # Use Cases (Academic Research)
71
+
72
+ - Evaluating **backdoor attacks** in small LLMs
73
+ - Measuring **model drift** under poisoned datasets
74
+ - Analyzing **trigger-word activation behavior**
75
+ - Teaching ML security concepts to students
76
+ - Simulating **unsafe RAG behaviors**
77
+
78
+ ---
79
+
80
+ # Limitations
81
+
82
+ - Not suitable for production
83
+ - Small model → limited reasoning depth
84
+ - **Responses may vary under adversarial prompts**
85
+ - Designed intentionally to observe vulnerability, not avoid it
86
+
87
+ ---
88