cg666/Qwen2.5-3B-Instruct-grpo-MATHDATA-E1
Text Generation
• 3B • Updated • 1
cg666/Qwen-2.5-7B-Instruct-Simple-RL-test
Updated
cg666/Qwen-2.5-7B-Instruct-Simple-RL
Updated
cg666/Qwen-2.5-7B-Simple-RL
Text Generation
• 8B • Updated • 3
cg666/Qwen2.5-3B-Instruct-grpo-E6-D100-L4096-lr5e7
Text Generation
• 3B • Updated • 2
cg666/Qwen2.5-3B-Instruct-grpo-E6-D8000-L4096-lr5e7
Text Generation
• 3B • Updated • 3
cg666/Qwen2.5-3B-Instruct-grpo-E6-D8000-L4096
Text Generation
• 3B • Updated • 2
cg666/Qwen2.5-3B-Instruct-grpo-E6-D8000
Updated
cg666/OLMoE-1B-7B-0125-Instruct-grpo-E6-D8000-L4096
Text Generation
• 7B • Updated • 1
cg666/DeepSeek-R1-Distill-Qwen-1.5B-GRPO
Updated
cg666/OLMoE-1B-7B-0125-Instruct-grpo-test
Updated
cg666/OLMoE-1B-7B-0125-Instruct-grpo-E6-D8000
7B • Updated • 1
cg666/OLMoE-1B-7B-0125-Instruct-grpo-E8-D8000
Text Generation
• 7B • Updated • 2
cg666/OLMoE-1B-7B-0125-Instruct-grpo-E6-D100
Text Generation
• 7B • Updated • 2
cg666/OLMoE-1B-7B-0125-Instruct-grpo-E5-D8000
Updated
cg666/OLMoE-1B-7B-0125-Instruct-grpo
Text Generation
• 7B • Updated • 3 • 1
cg666/Qwen2.5-3B-Instruct-grpo
Text Generation
• 3B • Updated • 1
cg666/OLMoE-1B-7B-0125-grpo
Updated
cg666/deepseek-v2-lite-chat-16B-grpo
Updated