iapo / .gitattributes
jonathanhe123: Upload Qwen2.5-7B-Instruct_GSM8K/model-00001-of-00004.safetensors with huggingface_hub (commit 820cb88, verified)
Qwen2.5-0.5B-Instruct_DAPO-Math-17k/model.safetensors filter=lfs diff=lfs merge=lfs -text
Qwen2.5-0.5B-Instruct_GSM8K/model.safetensors filter=lfs diff=lfs merge=lfs -text
Qwen2.5-0.5B-Instruct_GSM8K/tokenizer.json filter=lfs diff=lfs merge=lfs -text
Qwen2.5-0.5B-Instruct_DAPO-Math-17k/tokenizer.json filter=lfs diff=lfs merge=lfs -text
Qwen2.5-0.5B-Instruct_MATH-500/model.safetensors filter=lfs diff=lfs merge=lfs -text
Qwen2.5-0.5B-Instruct_MATH-500/tokenizer.json filter=lfs diff=lfs merge=lfs -text
Qwen2.5-1.5B-Instruct_DAPO-Math-17k/model.safetensors filter=lfs diff=lfs merge=lfs -text
Qwen2.5-1.5B-Instruct_DAPO-Math-17k/tokenizer.json filter=lfs diff=lfs merge=lfs -text
Qwen2.5-1.5B-Instruct_GSM8K/model.safetensors filter=lfs diff=lfs merge=lfs -text
Qwen2.5-1.5B-Instruct_GSM8K/tokenizer.json filter=lfs diff=lfs merge=lfs -text
Qwen2.5-1.5B-Instruct_MATH-500/model.safetensors filter=lfs diff=lfs merge=lfs -text
Qwen2.5-1.5B-Instruct_MATH-500/tokenizer.json filter=lfs diff=lfs merge=lfs -text
Qwen2.5-7B-Instruct_DAPO-Math-17k/tokenizer.json filter=lfs diff=lfs merge=lfs -text
Qwen2.5-7B-Instruct_DAPO-Math-17k/model-00001-of-00004.safetensors filter=lfs diff=lfs merge=lfs -text
Qwen2.5-7B-Instruct_DAPO-Math-17k/model-00002-of-00004.safetensors filter=lfs diff=lfs merge=lfs -text
Qwen2.5-7B-Instruct_DAPO-Math-17k/model-00003-of-00004.safetensors filter=lfs diff=lfs merge=lfs -text
Qwen2.5-7B-Instruct_DAPO-Math-17k/model-00004-of-00004.safetensors filter=lfs diff=lfs merge=lfs -text
Qwen2.5-7B-Instruct_GSM8K/tokenizer.json filter=lfs diff=lfs merge=lfs -text
Qwen2.5-7B-Instruct_GSM8K/model-00001-of-00004.safetensors filter=lfs diff=lfs merge=lfs -text