
Multi-Model Fine-Tuning with LoRA

This project demonstrates fine-tuning multiple language models using LoRA (Low-Rank Adaptation) on the NVIDIA Aegis AI Content Safety Dataset for content moderation and safety classification tasks.

Overview

This repository contains a comprehensive Jupyter notebook that fine-tunes three different language models for content safety applications:

  • Qwen3-0.6B-Base - Alibaba Cloud's compact language model
  • Llama-3.2-1B - Meta's efficient language model
  • ERNIE-4.5-0.3B-Base-PT - Baidu's knowledge-enhanced language model

All models are fine-tuned with LoRA (Low-Rank Adaptation), a parameter-efficient fine-tuning technique that reduces compute and memory requirements while maintaining performance.

Features

  • Multi-model comparison: Train and compare three different LLMs side-by-side
  • Efficient training: Uses LoRA for parameter-efficient fine-tuning
  • Quantization support: 4-bit quantization with bitsandbytes for reduced memory usage
  • Comprehensive evaluation: Perplexity calculation and sample generation
  • Automated deployment: Upload fine-tuned models to HuggingFace Hub
  • Detailed documentation: Generated model cards with full training details

Dataset

NVIDIA Aegis AI Content Safety Dataset 2.0

  • A comprehensive dataset for content safety and moderation tasks
  • Contains diverse examples of safe and unsafe content
  • Multiple categories of potentially harmful content
  • Training size: 2,000 samples (subset used for efficiency)

Requirements

Hardware

  • GPU with at least 8GB VRAM (recommended: 12GB+ for optimal performance)
  • CUDA-compatible GPU for accelerated training

Software

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# Quote version specifiers so the shell does not treat '>' as redirection
pip install "transformers>=4.38.0"
pip install "datasets>=2.16.0"
pip install "accelerate>=0.26.0"
pip install "peft>=0.8.0"
pip install "trl>=0.7.10"
pip install "bitsandbytes>=0.43.0"
pip install "sentencepiece>=0.1.99"
pip install "protobuf>=3.20.0"
pip install huggingface_hub
pip install scikit-learn pandas numpy matplotlib seaborn

Project Structure

hg/
├── multi_model_finetuning.ipynb   # Main training notebook
├── README.md                      # This file
├── LICENSE                        # Project license
├── output/                        # Fine-tuned models (generated)
│   ├── qwen3-0.6b/
│   ├── llama-3.2-1b/
│   └── ernie-0.3b/
├── model_comparison.csv           # Performance comparison (generated)
└── results_summary.json           # Full results JSON (generated)

Usage

1. Setup Environment

Open the Jupyter notebook and run the environment setup cells to install all dependencies:

# The notebook handles installation automatically
# Includes special handling for bitsandbytes and sentencepiece

2. Login to HuggingFace

from huggingface_hub import login
login()  # Enter your HF token when prompted

3. Configure Training

The notebook includes pre-configured settings for:

  • LoRA parameters: rank=16, alpha=32, dropout=0.05
  • Training hyperparameters: 3 epochs, learning rate=2e-4, batch size=4
  • Model targets: All three models with HuggingFace Hub upload paths

4. Run Training

Execute the notebook cells sequentially to:

  1. Load and preprocess the dataset
  2. Fine-tune each model with LoRA
  3. Evaluate model performance
  4. Generate comparison metrics
  5. Upload models to HuggingFace Hub
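Step 1 above depends on converting each dataset record into the instruction-style prompt used throughout this README. A minimal sketch of that formatting; the field names `text` and `label` are illustrative assumptions, as the actual Aegis column names may differ:

```python
def format_example(record: dict) -> str:
    """Format one dataset record into the '### Instruction / ### Response'
    template used in this project. Field names are assumptions."""
    return (
        "### Instruction:\n"
        f"Is this content safe? '{record['text']}'\n\n"
        "### Response:\n"
        f"{record['label']}"
    )

sample = {"text": "Hello, how are you?", "label": "safe"}
print(format_example(sample))
```

A function like this is typically mapped over the dataset before it is handed to the trainer.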

5. Test Fine-tuned Models

Use the interactive testing cell to evaluate models:

test_model_interactive("qwen3-0.6b",
    "### Instruction:\nIs this content safe? 'Hello, how are you?'\n\n### Response:\n")

Training Configuration

LoRA Parameters

{
    "r": 16,                    # Rank of LoRA matrices
    "lora_alpha": 32,          # LoRA scaling parameter
    "lora_dropout": 0.05,      # Dropout for LoRA layers
    "task_type": "CAUSAL_LM",
    "target_modules": [         # Modules to apply LoRA
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj"
    ]
}
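For reference, the same settings expressed as a `peft.LoraConfig` (a config sketch, assuming the `peft` version pinned above):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                 # rank of the LoRA update matrices
    lora_alpha=32,        # scaling factor (alpha / r scales the update)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```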

Training Hyperparameters

{
    "num_train_epochs": 3,
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 4,
    "learning_rate": 2e-4,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "fp16": True,
    "optim": "paged_adamw_8bit",
    "max_seq_length": 512
}
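These hyperparameters map onto `transformers.TrainingArguments` (a config sketch; note that `max_seq_length` is not a `TrainingArguments` field and is passed to the `trl` trainer separately in the versions pinned above, and `output_dir` here is illustrative):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="output/qwen3-0.6b",   # illustrative per-model path
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,    # effective batch size 4 * 4 = 16
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    fp16=True,
    optim="paged_adamw_8bit",
)
# max_seq_length=512 is passed to SFTTrainer rather than TrainingArguments
```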

Model Performance

After training, the notebook generates:

  • Perplexity scores for each model
  • Training time comparison
  • Sample predictions from test set
  • Comparative analysis across all models
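The perplexity scores above follow directly from the mean token-level cross-entropy loss on the evaluation set; a minimal sketch of the relationship:

```python
import math

def perplexity(mean_loss: float) -> float:
    """Perplexity is the exponential of the mean cross-entropy loss."""
    return math.exp(mean_loss)

print(perplexity(2.0))  # e^2, about 7.39
```

Lower loss therefore translates monotonically into lower (better) perplexity.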

Results are saved to:

  • model_comparison.csv - Performance metrics table
  • results_summary.json - Complete training results with metadata

Deployment

HuggingFace Hub

All fine-tuned models are automatically uploaded to:

  • ahczhg/qwen3-0.6b-aegis-safety-lora
  • ahczhg/llama-3.2-1b-aegis-safety-lora
  • ahczhg/ernie-4.5-0.3b-aegis-safety-lora

Each model includes:

  • Fine-tuned weights (merged with base model)
  • Tokenizer configuration
  • Comprehensive README with usage examples
  • Training metrics and evaluation results

Loading Fine-tuned Models

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "ahczhg/qwen3-0.6b-aegis-safety-lora"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

# Use for content safety classification
prompt = "### Instruction:\nAnalyze this content for safety: 'Your text here'\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)

Use Cases

Content Moderation

  • Automated detection of harmful content
  • Real-time chat safety monitoring
  • User-generated content screening

Safety Systems

  • Educational platform content filtering
  • Social media safety monitoring
  • Community guidelines enforcement

Research Applications

  • Comparative analysis of LLM safety alignment
  • Parameter-efficient fine-tuning research
  • Content safety model development

Key Features

Efficient Training

  • 4-bit quantization cuts weight memory by roughly 75% relative to FP16
  • LoRA trains only 0.1-1% of model parameters
  • Gradient checkpointing enables larger batch sizes
  • Mixed precision (FP16) accelerates training
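Both savings can be estimated on paper. For a weight matrix of shape (d_out, d_in), rank-r LoRA trains r*(d_in + d_out) parameters instead of d_out*d_in, and 4-bit storage uses a quarter of the bytes of FP16. A back-of-envelope sketch using illustrative dimensions, not the exact shapes of these models:

```python
def lora_fraction(d_out: int, d_in: int, r: int) -> float:
    """Fraction of parameters that are trainable when a (d_out, d_in)
    weight is adapted with rank-r LoRA factors."""
    return r * (d_in + d_out) / (d_out * d_in)

def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB at a given precision."""
    return n_params * bits_per_weight / 8 / 1e9

# A 1024x1024 projection adapted at rank 16: about 3% of that matrix is
# trainable; across a whole model (where many weights are untouched) the
# total falls into the 0.1-1% range quoted above.
print(f"{lora_fraction(1024, 1024, 16):.2%}")  # 3.12%

# 4-bit vs FP16 weights for a 1B-parameter model: 4/16 = 25% of the bytes,
# i.e. roughly a 75% reduction in weight memory.
print(weight_memory_gb(1e9, 4) / weight_memory_gb(1e9, 16))  # 0.25
```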

Robustness

  • Automatic fallback to FP16 if quantization unavailable
  • Error handling for tokenizer and dependency issues
  • Memory management and GPU cache clearing
  • Comprehensive logging and progress tracking
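The FP16 fallback can be sketched as a simple capability probe (a sketch of the idea, not the notebook's exact code; note that `bitsandbytes` also needs a working CUDA build to be useful at runtime):

```python
def pick_load_kwargs() -> dict:
    """Choose model-loading kwargs: 4-bit quantization when bitsandbytes
    is importable, otherwise plain FP16."""
    try:
        import bitsandbytes  # noqa: F401
        return {"load_in_4bit": True}
    except ImportError:
        # string dtype names are accepted by recent transformers versions
        return {"torch_dtype": "float16"}

print(pick_load_kwargs())
```

The returned dict would then be splatted into `AutoModelForCausalLM.from_pretrained(...)`.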

Automation

  • End-to-end pipeline from data loading to deployment
  • Automated model card generation
  • Batch evaluation and metrics calculation
  • One-click HuggingFace Hub upload

Limitations

  • Models are trained on English content primarily
  • Performance may vary on domain-specific content
  • Requires GPU for efficient training and inference
  • Should be used as part of a comprehensive moderation system, not as the sole arbiter

Ethical Considerations

This project is designed for:

  • ✅ Content safety and user protection
  • ✅ Harmful content detection
  • ✅ Educational and research purposes
  • ✅ Building safer online communities

Not intended for:

  • ❌ Censoring legitimate speech
  • ❌ Suppressing diverse viewpoints
  • ❌ Replacing human moderation entirely
  • ❌ Automated decision-making without oversight

Contributing

Contributions are welcome! Areas for improvement:

  • Additional model architectures
  • Multi-language support
  • Enhanced evaluation metrics
  • Custom dataset integration
  • Deployment optimizations

Citation

If you use this project in your research or applications, please cite:

@misc{multi_model_lora_content_safety,
  author = {ahczhg},
  title = {Multi-Model Fine-Tuning with LoRA for Content Safety},
  year = {2025},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/ahczhg/hg}},
  note = {Fine-tuned on NVIDIA Aegis AI Content Safety Dataset 2.0}
}

Acknowledgments

  • NVIDIA for the Aegis AI Content Safety Dataset 2.0
  • HuggingFace for transformers, datasets, PEFT, and TRL libraries
  • Meta AI for Llama-3.2 models
  • Alibaba Cloud for Qwen models
  • Baidu for ERNIE models
  • Tim Dettmers and the bitsandbytes maintainers for the quantization library

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

  • HuggingFace: ahczhg
  • GitHub: This repository

Support

For issues, questions, or feedback:

  1. Open an issue on GitHub
  2. Check the notebook documentation
  3. Review model cards on HuggingFace Hub
  4. Consult the training logs and error messages

Generated with Claude Code | Last updated: 2025-11-13
