# llmshield-1b-finetuned – Fine-Tuned Llama 3.2 Instruction Model
## Model Overview

LLMShield is a lightweight, efficient, instruction-following language model built on:

- Base Model: unsloth/Llama-3.2-1B-Instruct
- Fine-tuning Method: Supervised Fine-Tuning (SFT)
- Parameter Efficiency: LoRA adapters via PEFT
- Training Toolkit: TRL SFTTrainer + Unsloth chat utilities
- Intended Use: concise question answering, instruction following, and general-purpose conversation
The model is optimized for:
- Direct answers
- Instruction following
- Short explanations
- Simple reasoning
- Code explanation (small scale)
At only 1B parameters, it runs quickly even on CPU and mobile hardware.
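A quick back-of-envelope check of why CPU and mobile inference is feasible (assumed sizes; this ignores activation and KV-cache memory):

```python
# Rough memory footprint of the weights for a ~1B-parameter model
params = 1_000_000_000           # ~1B parameters
fp16_gb = params * 2 / 1024**3   # 2 bytes per weight in float16
fp32_gb = params * 4 / 1024**3   # 4 bytes per weight in float32
print(f"fp16 weights: ~{fp16_gb:.1f} GB, fp32 weights: ~{fp32_gb:.1f} GB")
```

Roughly 2 GB of fp16 weights fits comfortably in commodity RAM, which is what makes local inference practical.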
## Training Details

### Fine-tuning Strategy
The model was trained using:

- TRL SFTTrainer (supervised fine-tuning)
- LoRA adapters (very low VRAM cost)
- Loss computed on assistant responses only
- Llama-style chat formatting
- No sequence packing (for training stability)
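For intuition about the Llama-style chat formatting used during training, the Llama 3 template wraps each turn in special header tokens. Below is a hand-rolled sketch of that format; in practice `tokenizer.apply_chat_template` produces the canonical string, so treat this as illustrative rather than authoritative:

```python
def format_llama3_chat(messages, add_generation_prompt=True):
    """Approximate the Llama 3 chat template as a plain string."""
    out = "<|begin_of_text|>"
    for msg in messages:
        # Each turn is framed by role headers and terminated with <|eot_id|>
        out += f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
        out += msg["content"] + "<|eot_id|>"
    if add_generation_prompt:
        # Cue the model to continue as the assistant
        out += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return out

prompt = format_llama3_chat([
    {"role": "user", "content": "What is LLMShield?"},
])
print(prompt)
```

Training on "assistant responses only" means the loss is masked so that only the tokens after the final assistant header contribute to the gradient.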
## How To Use This Model

Below is the official inference code, verified to run on Colab, local Jupyter, or any standard Python environment.
### Install Requirements

```bash
pip install transformers accelerate peft torch huggingface_hub
```
### Load Model & Tokenizer

```python
from huggingface_hub import notebook_login
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# notebook_login()  # Required only if the model repo is private

# Base Llama 3.2 model
base_model_id = "unsloth/Llama-3.2-1B-Instruct"

# Fine-tuned LoRA adapter
adapter_id = "arif-butt/llmshield-1b-finetuned"

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Apply the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, adapter_id)

# Evaluation mode
model.eval()
```
### Chat Function Example

```python
def chat(user_message, max_new_tokens=80):
    # Construct the structured messages required by the Llama chat template
    messages = [
        {
            "role": "system",
            "content": (
                "Answer the user's question directly and concisely. "
                "Do NOT ask new questions or create a Q&A list."
            ),
        },
        {"role": "user", "content": user_message},
    ]

    # Convert messages to token IDs using the built-in chat template
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)

    # Generate a continuation
    outputs = model.generate(
        inputs,
        max_new_tokens=max_new_tokens,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
    )

    # Strip the prompt tokens, then decode the assistant reply
    generated_tokens = outputs[0, inputs.shape[-1]:]
    answer = tokenizer.decode(generated_tokens, skip_special_tokens=True).strip()
    return answer
```
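The `temperature` and `top_p` arguments above control sampling. A minimal pure-Python sketch of nucleus (top-p) filtering over a toy four-token vocabulary, to show what those settings do (the real implementation lives inside `model.generate`):

```python
import math

def top_p_filter(logits, temperature=0.7, top_p=0.9):
    """Keep the smallest set of tokens whose cumulative probability
    reaches top_p, after temperature scaling (illustrative only)."""
    # Temperature scaling: lower temperature sharpens the distribution
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort tokens by probability, descending
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    return kept  # indices of tokens eligible for sampling

# Toy vocabulary of 4 tokens with raw logits
eligible = top_p_filter([2.0, 1.0, 0.5, -1.0])
print(eligible)
```

Lower `temperature` concentrates probability on the top tokens, and `top_p=0.9` then discards the unlikely tail before sampling.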
### Example Usage

```python
print(chat("What is LLMShield?"))
```

Example output:

```text
LLMShield is a fine-tuned Llama 3.2 1B instruction model designed
to answer questions concisely and follow user instructions.
```
## Capabilities
This model performs well at:
- Question answering
- Instruction following
- Short summaries
- Explaining concepts simply
- General conversation
- Running on modest hardware (CPU / mobile)
## Limitations
As a small LLM, it may:
- Produce incorrect facts
- Struggle with long chains of reasoning
- Lack knowledge of events after its training cutoff
- Produce unsafe or biased responses (always verify)
Do not use it for:
- Medical advice
- Legal advice
- Financial advice
- Anything requiring guaranteed correctness
## Intended Users
- Students learning LLMs
- Developers building lightweight apps
- Educators teaching fine-tuning
- Anyone wanting fast local inference
## Repository Files

This model card applies to a repository containing at least:

- adapter_model.safetensors or pytorch_lora_weights.bin
- tokenizer.json
- config.json
- special_tokens_map.json
- README.md (this file)
## License

Specify your license, for example:

- MIT License
- Apache-2.0
## Citation

If you use this model in research:

```text
LLMShield: Fine-tuned Llama 3.2 1B model
Authors: Dr. Muhammad Arif Butt, Um E Abeeha, Khalood Sami, and Alisha Shahid
Hugging Face: arif-butt/llmshield-1b-finetuned
```