MASA
Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning

jadohu/Qwen3-14B-MASA • Reinforcement Learning • 15B • Updated Nov 26, 2025 • 5 • 1
jadohu/Qwen3-14B-GRPO • Reinforcement Learning • 15B • Updated Nov 26, 2025 • 6 • 1
jadohu/Qwen3-8B-MASA • Reinforcement Learning • 8B • Updated Nov 26, 2025 • 4 • 2
jadohu/Qwen3-8B-MASA-efficient • Reinforcement Learning • 8B • Updated Nov 26, 2025 • 5 • 1