| license: apache-2.0 | |
| base_model: | |
| - Qwen/Qwen3-4B | |
| datasets: | |
| - Kearm/Acc_Qwen_4B_Dataset | |
| # Model Card for Acc Qwen 4B | |
| Acc Qwen 4B is a state-of-the-art accessibility model trained with GRPO reinforcement learning. Training used RM-R1-style Chain-of-Rubrics distillation of Claude 4 Opus, using Gemini 2.5 Flash, into Qwen3 4B over 18 million tokens. | |
| The code for training the model is at https://github.com/Nottlespike/Accessible_Qwen |
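
For readers unfamiliar with GRPO, the core idea is that several completions are sampled for the same prompt, each is scored, and every score is normalized against the rest of its group. The sketch below illustrates only that normalization step. It is not taken from the training repository: the function name `group_relative_advantages`, the `eps` parameter, the use of the population standard deviation, and the example scores are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for a single prompt.

    Hypothetical helper: each advantage is (reward - group mean) divided by
    (group standard deviation + eps), where eps guards against a zero
    deviation when every completion receives the same score.
    """
    if not rewards:
        return []
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rubric scores for four completions of the same prompt.
rubric_scores = [0.2, 0.5, 0.9, 0.4]
advantages = group_relative_advantages(rubric_scores)

for score, adv in zip(rubric_scores, advantages):
    print(f"score={score:.2f} advantage={adv:+.3f}")
```

Completions scoring above the group mean receive a positive advantage and are reinforced, while those below the mean are discouraged. How the rubric scores themselves were produced for this model is defined in the linked training code, not in this sketch.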