AmberYifan/Qwen3-4B-GSM8K-MARL-structure
Updated
AmberYifan/qwen3-4b-thinking-full-pretrain-mix-low-tweet-1m-en-sft
Text Generation
• 4B • Updated • 1
AmberYifan/qwen3-4b-thinking-full-pretrain-mix-mid-tweet-1m-en-sft
Text Generation
• 4B • Updated • 1
AmberYifan/qwen3-4b-thinking-full-pretrain-mix-high-tweet-1m-en-sft
Text Generation
• 4B • Updated • 1
AmberYifan/qwen3-4b-thinking-full-pretrain-junk-tweet-1m-en-sft
Text Generation
• 4B • Updated • 1
AmberYifan/qwen3-4b-thinking-full-pretrain-control-tweet-1m-en-sft
Text Generation
• 4B • Updated • 1
AmberYifan/qwen3-4b-thinking-full-pretrain-mix-low-tweet-1m-en
Text Generation
• 4B • Updated • 4
AmberYifan/qwen3-4b-thinking-full-pretrain-mix-mid-tweet-1m-en
Text Generation
• 4B • Updated • 1
AmberYifan/qwen3-4b-thinking-full-pretrain-mix-high-tweet-1m-en
Text Generation
• 4B • Updated
AmberYifan/qwen3-4b-thinking-full-pretrain-control-tweet-1m-en
Text Generation
• 4B • Updated • 1
AmberYifan/qwen3-4b-thinking-full-pretrain-junk-tweet-1m-en
Text Generation
• 4B • Updated
AmberYifan/Qwen3-4B-GSM8K-GRPO-len-control
Updated
AmberYifan/Qwen3-4B-Thinking-2507-OpenR1Math-MARL-embgraph
4B • Updated
AmberYifan/Qwen3-4B-OpenR1Math-MARL-embgraph
AmberYifan/Llama-3-8B-Instruct-wildfeedback-RPO-iterDPO-iter1
Text Generation
• 266k • Updated
AmberYifan/Llama-3-8B-Instruct-wildfeedback-RPO-DRIFT-iter1
Text Generation
• 266k • Updated
AmberYifan/qwen3-4b-thinking-full-pretrain-mix-low-tweet-1m-en-gpt-sft
Text Generation
• 4B • Updated • 1
AmberYifan/qwen3-4b-thinking-full-pretrain-mix-mid-tweet-1m-en-gpt-sft
Text Generation
• 4B • Updated • 2
AmberYifan/qwen3-4b-thinking-full-pretrain-mix-high-tweet-1m-en-gpt-sft
Text Generation
• 4B • Updated • 1
AmberYifan/qwen3-4b-thinking-full-pretrain-junk-tweet-1m-en-gpt-sft
Text Generation
• 4B • Updated • 1
AmberYifan/qwen3-4b-thinking-full-pretrain-control-tweet-1m-en-gpt-sft
Text Generation
• 4B • Updated • 1
AmberYifan/qwen3-4b-thinking-full-pretrain-mix-low-tweet-1m-en-gpt
Text Generation
• 4B • Updated • 3
AmberYifan/qwen3-4b-thinking-full-pretrain-mix-mid-tweet-1m-en-gpt
Text Generation
• 4B • Updated
AmberYifan/qwen3-4b-thinking-full-pretrain-mix-high-tweet-1m-en-gpt
Text Generation
• 4B • Updated • 2
AmberYifan/qwen3-4b-thinking-full-pretrain-control-tweet-1m-en-gpt
Text Generation
• 4B • Updated • 7
AmberYifan/qwen3-4b-thinking-full-pretrain-junk-tweet-1m-en-gpt
Text Generation
• 4B • Updated • 1
AmberYifan/Llama-3-8B-Instruct-wildfeedback-seed-RPO-0.001
Text Generation
• 266k • Updated
AmberYifan/qwen3-8b-full-pretrain-junk-tweet-1m-en-gpt
Updated
AmberYifan/Qwen3-4B-OpenR1Math-GRPO
Text Generation
• 4B • Updated • 1
• 1
AmberYifan/Qwen3-4B-Thinking-Math-GRPO
Updated