--- language: en license: mit --- # M-1117_newmodels__qwen7b_R1Distill_ct3arg-rl ## Model Details - **Training Method**: VeRL Reinforcement Learning (RL) - **Stage Name**: rl - **Experiment**: 1117_newmodels__qwen7b_R1Distill_ct3arg - **RL Framework**: VeRL (Versatile Reinforcement Learning) ## Training Configuration ## Experiment Tracking 🔗 **View complete experiment details**: https://huggingface.co/datasets/TAUR-dev/D-ExpTracker__1117_newmodels__qwen7b_R1Distill_ct3arg__v1 ## Usage ```python from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("TAUR-dev/M-1117_newmodels__qwen7b_R1Distill_ct3arg-rl") model = AutoModelForCausalLM.from_pretrained("TAUR-dev/M-1117_newmodels__qwen7b_R1Distill_ct3arg-rl") ```