Multi-Model Fine-Tuning with LoRA
This project demonstrates fine-tuning multiple language models using LoRA (Low-Rank Adaptation) on the NVIDIA Aegis AI Content Safety Dataset for content moderation and safety classification tasks.
Overview
This repository contains a comprehensive Jupyter notebook that fine-tunes three different language models for content safety applications:
- Qwen3-0.6B-Base - Alibaba Cloud's compact language model
- Llama-3.2-1B - Meta's efficient language model
- ERNIE-4.5-0.3B-Base-PT - Baidu's knowledge-enhanced language model
All models are fine-tuned using LoRA (Low-Rank Adaptation), a parameter-efficient fine-tuning technique that reduces compute and memory requirements while maintaining performance.
Features
- Multi-model comparison: Train and compare three different LLMs side-by-side
- Efficient training: Uses LoRA for parameter-efficient fine-tuning
- Quantization support: 4-bit quantization with bitsandbytes for reduced memory usage
- Comprehensive evaluation: Perplexity calculation and sample generation
- Automated deployment: Upload fine-tuned models to HuggingFace Hub
- Detailed documentation: Generated model cards with full training details
Dataset
NVIDIA Aegis AI Content Safety Dataset 2.0
- A comprehensive dataset for content safety and moderation tasks
- Contains diverse examples of safe and unsafe content
- Multiple categories of potentially harmful content
- Training size: 2,000 samples (subset used for efficiency)
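Each sample is formatted into the instruction/response template used throughout this README before training. The sketch below shows the idea; the exact template wording and dataset field names here are assumptions, not the notebook's verbatim code.

```python
# Sketch: turning a dataset sample into an instruction-style training
# example. The template string and the ("text", "label") field names are
# illustrative assumptions; adapt them to the actual dataset schema.

def format_sample(text: str, label: str) -> str:
    """Build one instruction/response training example."""
    return (
        "### Instruction:\n"
        f"Is this content safe? '{text}'\n\n"
        "### Response:\n"
        f"{label}"
    )

example = format_sample("Hello, how are you?", "safe")
print(example)
```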
Requirements
Hardware
- GPU with at least 8GB VRAM (recommended: 12GB+ for optimal performance)
- CUDA-compatible GPU for accelerated training
Software
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# Version specifiers are quoted so the shell does not treat ">" as a redirect
pip install "transformers>=4.38.0"
pip install "datasets>=2.16.0"
pip install "accelerate>=0.26.0"
pip install "peft>=0.8.0"
pip install "trl>=0.7.10"
pip install "bitsandbytes>=0.43.0"
pip install "sentencepiece>=0.1.99"
pip install "protobuf>=3.20.0"
pip install huggingface_hub
pip install scikit-learn pandas numpy matplotlib seaborn
Project Structure
hg/
├── multi_model_finetuning.ipynb   # Main training notebook
├── README.md                      # This file
├── LICENSE                        # Project license
├── output/                        # Fine-tuned models (generated)
│   ├── qwen3-0.6b/
│   ├── llama-3.2-1b/
│   └── ernie-0.3b/
├── model_comparison.csv           # Performance comparison (generated)
└── results_summary.json           # Full results JSON (generated)
Usage
1. Setup Environment
Open the Jupyter notebook and run the environment setup cells to install all dependencies:
# The notebook handles installation automatically
# Includes special handling for bitsandbytes and sentencepiece
2. Login to HuggingFace
from huggingface_hub import login
login() # Enter your HF token when prompted
3. Configure Training
The notebook includes pre-configured settings for:
- LoRA parameters: rank=16, alpha=32, dropout=0.05
- Training hyperparameters: 3 epochs, learning rate=2e-4, batch size=4
- Model targets: All three models with HuggingFace Hub upload paths
4. Run Training
Execute the notebook cells sequentially to:
- Load and preprocess the dataset
- Fine-tune each model with LoRA
- Evaluate model performance
- Generate comparison metrics
- Upload models to HuggingFace Hub
5. Test Fine-tuned Models
Use the interactive testing cell to evaluate models:
test_model_interactive(
    "qwen3-0.6b",
    "### Instruction:\nIs this content safe? 'Hello, how are you?'\n\n### Response:\n"
)
Training Configuration
LoRA Parameters
{
"r": 16, # Rank of LoRA matrices
"lora_alpha": 32, # LoRA scaling parameter
"lora_dropout": 0.05, # Dropout for LoRA layers
"task_type": "CAUSAL_LM",
"target_modules": [ # Modules to apply LoRA
"q_proj", "k_proj", "v_proj", "o_proj",
"gate_proj", "up_proj", "down_proj"
]
}
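The settings above map directly onto `peft`'s `LoraConfig`. A minimal sketch of that mapping (the commented `get_peft_model` call stands in for the per-model setup the notebook performs):

```python
from peft import LoraConfig

# The LoRA parameters above expressed as a peft LoraConfig.
lora_config = LoraConfig(
    r=16,                    # rank of the LoRA update matrices
    lora_alpha=32,           # scaling factor applied to the LoRA update
    lora_dropout=0.05,       # dropout on the LoRA layers
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
# model = get_peft_model(base_model, lora_config)  # applied to each base model
```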
Training Hyperparameters
{
"num_train_epochs": 3,
"per_device_train_batch_size": 4,
"gradient_accumulation_steps": 4,
"learning_rate": 2e-4,
"lr_scheduler_type": "cosine",
"warmup_ratio": 0.1,
"fp16": True,
"optim": "paged_adamw_8bit",
"max_seq_length": 512
}
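Note that with gradient accumulation, the effective batch size per optimizer step is the per-device batch size times the accumulation steps (times the number of GPUs, assumed here to be one):

```python
# Effective batch size implied by the hyperparameters above: each optimizer
# step accumulates gradients over this many samples per GPU.
per_device_train_batch_size = 4
gradient_accumulation_steps = 4
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 16
```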
Model Performance
After training, the notebook generates:
- Perplexity scores for each model
- Training time comparison
- Sample predictions from test set
- Comparative analysis across all models
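Perplexity is the exponential of the mean per-token cross-entropy loss over the evaluation set. The notebook's exact implementation may differ, but the computation reduces to this:

```python
import math

def perplexity(token_losses):
    """Perplexity = exp(mean per-token cross-entropy loss).

    `token_losses` is a list of per-token negative log-likelihoods
    collected over an evaluation set.
    """
    return math.exp(sum(token_losses) / len(token_losses))

print(perplexity([0.0, 0.0]))                     # 1.0 (perfect predictions)
print(round(perplexity([2.0, 2.0, 2.0]), 3))      # 7.389
```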
Results are saved to:
- model_comparison.csv - Performance metrics table
- results_summary.json - Complete training results with metadata
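A minimal sketch of how such a comparison table can be written with the standard library. The column names and the zeroed values are placeholders, not the notebook's actual output schema or results:

```python
import csv

# Illustrative only: placeholder rows with hypothetical column names.
rows = [
    {"model": "qwen3-0.6b", "perplexity": 0.0, "train_time_s": 0.0},
    {"model": "llama-3.2-1b", "perplexity": 0.0, "train_time_s": 0.0},
    {"model": "ernie-0.3b", "perplexity": 0.0, "train_time_s": 0.0},
]
with open("model_comparison.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["model", "perplexity", "train_time_s"])
    writer.writeheader()
    writer.writerows(rows)
```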
Deployment
HuggingFace Hub
All fine-tuned models are automatically uploaded to:
- ahczhg/qwen3-0.6b-aegis-safety-lora
- ahczhg/llama-3.2-1b-aegis-safety-lora
- ahczhg/ernie-4.5-0.3b-aegis-safety-lora
Each model includes:
- Fine-tuned weights (merged with base model)
- Tokenizer configuration
- Comprehensive README with usage examples
- Training metrics and evaluation results
Loading Fine-tuned Models
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model_name = "ahczhg/qwen3-0.6b-aegis-safety-lora"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype=torch.float16,
device_map="auto"
)
# Use for content safety classification
prompt = "### Instruction:\nAnalyze this content for safety: 'Your text here'\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
Use Cases
Content Moderation
- Automated detection of harmful content
- Real-time chat safety monitoring
- User-generated content screening
Safety Systems
- Educational platform content filtering
- Social media safety monitoring
- Community guidelines enforcement
Research Applications
- Comparative analysis of LLM safety alignment
- Parameter-efficient fine-tuning research
- Content safety model development
Key Features
Efficient Training
- 4-bit quantization reduces weight memory by roughly 75% relative to FP16
- LoRA trains only 0.1-1% of model parameters
- Gradient checkpointing enables larger batch sizes
- Mixed precision (FP16) accelerates training
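A typical 4-bit (NF4) setup with bitsandbytes looks like the following; the exact settings used in the notebook may differ:

```python
import torch
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization config; passed to from_pretrained via
# quantization_config (see the commented call below).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # normalized-float-4 weights
    bnb_4bit_compute_dtype=torch.float16,  # matmuls run in FP16
    bnb_4bit_use_double_quant=True,        # also quantize the quantization constants
)
# model = AutoModelForCausalLM.from_pretrained(name, quantization_config=bnb_config)
```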
Robustness
- Automatic fallback to FP16 if quantization unavailable
- Error handling for tokenizer and dependency issues
- Memory management and GPU cache clearing
- Comprehensive logging and progress tracking
Automation
- End-to-end pipeline from data loading to deployment
- Automated model card generation
- Batch evaluation and metrics calculation
- One-click HuggingFace Hub upload
Limitations
- Models are trained on English content primarily
- Performance may vary on domain-specific content
- Requires GPU for efficient training and inference
- Should be used as part of comprehensive moderation system, not sole arbiter
Ethical Considerations
This project is designed for:
- ✅ Content safety and user protection
- ✅ Harmful content detection
- ✅ Educational and research purposes
- ✅ Building safer online communities
Not intended for:
- ❌ Censoring legitimate speech
- ❌ Suppressing diverse viewpoints
- ❌ Replacing human moderation entirely
- ❌ Automated decision-making without oversight
Contributing
Contributions are welcome! Areas for improvement:
- Additional model architectures
- Multi-language support
- Enhanced evaluation metrics
- Custom dataset integration
- Deployment optimizations
Citation
If you use this project in your research or applications, please cite:
@misc{multi_model_lora_content_safety,
author = {ahczhg},
title = {Multi-Model Fine-Tuning with LoRA for Content Safety},
year = {2025},
publisher = {GitHub},
howpublished = {\url{https://github.com/ahczhg/hg}},
note = {Fine-tuned on NVIDIA Aegis AI Content Safety Dataset 2.0}
}
Acknowledgments
- NVIDIA for the Aegis AI Content Safety Dataset 2.0
- HuggingFace for transformers, datasets, PEFT, and TRL libraries
- Meta AI for Llama-3.2 models
- Alibaba Cloud for Qwen models
- Baidu for ERNIE models
- Tim Dettmers and contributors for the bitsandbytes quantization library
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contact
- HuggingFace: ahczhg
- GitHub: This repository
Support
For issues, questions, or feedback:
- Open an issue on GitHub
- Check the notebook documentation
- Review model cards on HuggingFace Hub
- Consult the training logs and error messages
Generated with Claude Code | Last updated: 2025-11-13