Kurapika993
posted an update 28 days ago
🚀 Released two lightweight, instruction-tuned Responsible AI models focused on toxicity, bias, and safety analysis

Model 1: Responsible AI Safety Assistant (Qwen 2.5)

Kurapika993/qwen2.5-7b-responsible-ai-qlora
Base Model: Qwen2.5-7B-Instruct
Method: QLoRA
Training Data: BeaverTails + Wiki Toxic + custom Responsible AI instruction dataset

Model 2: Responsible AI Assistant (Llama)

Kurapika993/llama-3.1-8b-responsible-ai-safety-lora
Base Model: Llama-3.1-8B-Instruct
Method: QLoRA
Training Data: BeaverTails + Wiki Toxic + custom curated examples

This model follows the same structured output format but explores the impact of a different base architecture on safety-analysis tasks.
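
For anyone who wants to try either model, here is a minimal loading sketch. It assumes both repositories contain PEFT adapter weights (not merged checkpoints) and that the base models live under the Hub IDs `Qwen/Qwen2.5-7B-Instruct` and `meta-llama/Llama-3.1-8B-Instruct`; neither assumption is confirmed in this post. The `load_model` helper and the 4-bit settings are illustrative choices, not the training configuration.

```python
# Adapter repo -> base model repo. The base model Hub IDs are assumptions.
ADAPTER_TO_BASE = {
    "Kurapika993/qwen2.5-7b-responsible-ai-qlora": "Qwen/Qwen2.5-7B-Instruct",
    "Kurapika993/llama-3.1-8b-responsible-ai-safety-lora": "meta-llama/Llama-3.1-8B-Instruct",
}


def load_model(adapter_id: str):
    """Load a 4-bit quantized base model and attach the adapter (hypothetical helper)."""
    # Heavy imports are kept inside the function so the mapping above
    # can be inspected without transformers/peft installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    base_id = ADAPTER_TO_BASE[adapter_id]

    # 4-bit NF4 quantization, the usual setup for QLoRA-style inference.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(
        base_id,
        quantization_config=quant_config,
        device_map="auto",
    )

    # Assumes the repo holds adapter weights; a merged checkpoint would
    # instead be loaded directly with AutoModelForCausalLM.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `load_model("Kurapika993/qwen2.5-7b-responsible-ai-qlora")` needs a GPU with `bitsandbytes` installed, and the Llama base model additionally requires accepting its license on the Hub.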

Intended Use

These models are designed for:

✅ Responsible AI research
✅ Moderation decisions
✅ Safety and bias analysis
✅ Human-in-the-loop moderation workflows
✅ Dataset generation and annotation assistance
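
As a rough illustration of the human-in-the-loop item above, the sketch below routes a parsed model analysis either to automatic approval or to a human reviewer. This post does not describe the models' output schema, so the `label` and `confidence` fields, the `Analysis` class, the `route` function, and the 0.9 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Analysis:
    """Hypothetical parsed model output; the real schema is not given here."""
    label: str         # assumed values: "safe" or "unsafe"
    confidence: float  # assumed range: 0.0 to 1.0


def route(analysis: Analysis, auto_threshold: float = 0.9) -> str:
    """Auto-allow only confident 'safe' results; send everything else to a person."""
    if analysis.label == "safe" and analysis.confidence >= auto_threshold:
        return "allow"
    # Unsafe or low-confidence results are never actioned automatically.
    return "human_review"


print(route(Analysis(label="safe", confidence=0.95)))
print(route(Analysis(label="unsafe", confidence=0.99)))
```

The design choice here is that the model never blocks content by itself: anything it flags, or is unsure about, goes to a reviewer.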