Yifan Wang

AmberYifan

AI & ML interests

None yet

Recent Activity

authored a paper 4 days ago

Addressing Performance Saturation for LLM RL via Precise Entropy Curve Control

upvoted a paper 4 days ago

Addressing Performance Saturation for LLM RL via Precise Entropy Curve Control

published a model 4 months ago

AmberYifan/Qwen2.5-3B-MATH-MARL-structure-only

AmberYifan's models (276)

AmberYifan/Qwen3-4B-GSM8K-MARL-structure

Updated Sep 17, 2025

AmberYifan/qwen3-4b-thinking-full-pretrain-mix-low-tweet-1m-en-sft

Text Generation • 4B • Updated Sep 16, 2025 • 1

AmberYifan/qwen3-4b-thinking-full-pretrain-mix-mid-tweet-1m-en-sft

Text Generation • 4B • Updated Sep 16, 2025 • 1

AmberYifan/qwen3-4b-thinking-full-pretrain-mix-high-tweet-1m-en-sft

Text Generation • 4B • Updated Sep 16, 2025 • 9

AmberYifan/qwen3-4b-thinking-full-pretrain-junk-tweet-1m-en-sft

Text Generation • 4B • Updated Sep 16, 2025 • 2

AmberYifan/qwen3-4b-thinking-full-pretrain-control-tweet-1m-en-sft

Text Generation • 4B • Updated Sep 16, 2025 • 3

AmberYifan/qwen3-4b-thinking-full-pretrain-mix-low-tweet-1m-en

Text Generation • 4B • Updated Sep 16, 2025 • 2

AmberYifan/qwen3-4b-thinking-full-pretrain-mix-mid-tweet-1m-en

Text Generation • 4B • Updated Sep 16, 2025 • 3

AmberYifan/qwen3-4b-thinking-full-pretrain-mix-high-tweet-1m-en

Text Generation • 4B • Updated Sep 16, 2025 • 2

AmberYifan/qwen3-4b-thinking-full-pretrain-control-tweet-1m-en

Text Generation • 4B • Updated Sep 16, 2025 • 2

AmberYifan/qwen3-4b-thinking-full-pretrain-junk-tweet-1m-en

Text Generation • 4B • Updated Sep 16, 2025 • 3

AmberYifan/Qwen3-4B-GSM8K-GRPO-len-control

Updated Sep 14, 2025

AmberYifan/Qwen3-4B-Thinking-2507-OpenR1Math-MARL-embgraph

4B • Updated Sep 10, 2025 • 1

AmberYifan/Qwen3-4B-OpenR1Math-MARL-embgraph

4B • Updated Sep 6, 2025 • 1

AmberYifan/Llama-3-8B-Instruct-wildfeedback-RPO-iterDPO-iter1

Text Generation • 266k • Updated Aug 30, 2025 • 1

AmberYifan/Llama-3-8B-Instruct-wildfeedback-RPO-DRIFT-iter1

Text Generation • 266k • Updated Aug 30, 2025 • 2

AmberYifan/qwen3-4b-thinking-full-pretrain-mix-low-tweet-1m-en-gpt-sft

Text Generation • 4B • Updated Aug 30, 2025 • 1

AmberYifan/qwen3-4b-thinking-full-pretrain-mix-mid-tweet-1m-en-gpt-sft

Text Generation • 4B • Updated Aug 30, 2025 • 3

AmberYifan/qwen3-4b-thinking-full-pretrain-mix-high-tweet-1m-en-gpt-sft

Text Generation • 4B • Updated Aug 30, 2025 • 1

AmberYifan/qwen3-4b-thinking-full-pretrain-junk-tweet-1m-en-gpt-sft

Text Generation • 4B • Updated Aug 30, 2025 • 2

AmberYifan/qwen3-4b-thinking-full-pretrain-control-tweet-1m-en-gpt-sft

Text Generation • 4B • Updated Aug 30, 2025 • 3

AmberYifan/qwen3-4b-thinking-full-pretrain-mix-low-tweet-1m-en-gpt

Text Generation • 4B • Updated Aug 30, 2025 • 1

AmberYifan/qwen3-4b-thinking-full-pretrain-mix-mid-tweet-1m-en-gpt

Text Generation • 4B • Updated Aug 30, 2025 • 2

AmberYifan/qwen3-4b-thinking-full-pretrain-mix-high-tweet-1m-en-gpt

Text Generation • 4B • Updated Aug 30, 2025 • 3

AmberYifan/qwen3-4b-thinking-full-pretrain-control-tweet-1m-en-gpt

Text Generation • 4B • Updated Aug 30, 2025 • 5

AmberYifan/qwen3-4b-thinking-full-pretrain-junk-tweet-1m-en-gpt

Text Generation • 4B • Updated Aug 30, 2025 • 5

AmberYifan/Llama-3-8B-Instruct-wildfeedback-seed-RPO-0.001

Text Generation • 266k • Updated Aug 30, 2025 • 6

AmberYifan/qwen3-8b-full-pretrain-junk-tweet-1m-en-gpt

Updated Aug 30, 2025

AmberYifan/Qwen3-4B-OpenR1Math-GRPO

Text Generation • 4B • Updated Aug 24, 2025 • 2 • 1

AmberYifan/Qwen3-4B-Thinking-Math-GRPO

Updated Aug 16, 2025