---
language: en
license: mit
---

# M-r1_distill_baseline-rl

## Model Details

- **Training Method**: VeRL Reinforcement Learning (RL)
- **Stage Name**: rl
- **Experiment**: r1_distill_baseline
- **RL Framework**: VeRL (Volcano Engine Reinforcement Learning)

## Training Configuration

## Experiment Tracking

🔗 **View complete experiment details**: https://huggingface.co/datasets/TAUR-dev/D-ExpTracker__r1_distill_baseline__v1

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the RL-stage checkpoint from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("TAUR-dev/M-r1_distill_baseline-rl")
model = AutoModelForCausalLM.from_pretrained("TAUR-dev/M-r1_distill_baseline-rl")
```
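Once the tokenizer and model are loaded, text generation could look roughly like the sketch below. The helper names `load` and `generate`, the example prompt, and the `max_new_tokens` default are illustrative assumptions, not part of this model card. The sketch also feeds the prompt as plain text, since the card does not say whether the tokenizer ships a chat template.

```python
MODEL_ID = "TAUR-dev/M-r1_distill_baseline-rl"


def load(model_id=MODEL_ID):
    """Load tokenizer and model from the Hub (hypothetical helper)."""
    # Imported here so the module can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return tokenizer, model


def generate(tokenizer, model, prompt, max_new_tokens=256):
    """Return only the newly generated text for a plain-text prompt.

    max_new_tokens=256 is an assumed default, not a recommended setting.
    """
    # Tokenize the prompt into PyTorch tensors.
    inputs = tokenizer(prompt, return_tensors="pt")
    # Decode with the model's default generation settings.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the continuation is decoded.
    prompt_len = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)
```

To try it, call `tokenizer, model = load()` and then `generate(tokenizer, model, "What is 12 * 7?")`. Slicing off the prompt tokens keeps the returned string limited to the model's continuation, which is usually what downstream evaluation code expects.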