Agent Banana: High-Fidelity Image Editing with Agentic Thinking and Tooling Paper • 2602.09084 • Published Feb 9 • 27
MLLM Reasoning, Rewarding, and Understanding Collection Papers on the reasoning, rewarding, and understanding of the MLLMs and LLMs • 30 items • Updated about 1 month ago • 1
On the Entropy Dynamics in Reinforcement Fine-Tuning of Large Language Models Paper • 2602.03392 • Published Feb 3 • 54
shuoxing/qwen2-5-7b-full-sft-control-tweet-1m-en-reproduce-bs128 Text Generation • 333k • Updated Jan 26
shuoxing/qwen2-5-7b-full-sft-mix-high-tweet-1m-en-reproduce-bs128 Text Generation • 333k • Updated Jan 26
shuoxing/qwen2-5-7b-full-sft-mix-mid-tweet-1m-en-reproduce-bs128 Text Generation • 333k • Updated Jan 25
shuoxing/qwen2-5-7b-full-sft-mix-low-tweet-1m-en-reproduce-bs128 Text Generation • 333k • Updated Jan 25
shuoxing/qwen3-4b-full-sft-control-tweet-1m-en-reproduce-bs128 Text Generation • 196k • Updated Jan 25
shuoxing/qwen3-4b-full-sft-mix-high-tweet-1m-en-reproduce-bs128 Text Generation • 196k • Updated Jan 25
shuoxing/qwen3-4b-full-sft-mix-mid-tweet-1m-en-reproduce-bs128 Text Generation • 196k • Updated Jan 25
shuoxing/qwen3-4b-full-sft-mix-low-tweet-1m-en-reproduce-bs128 Text Generation • 196k • Updated Jan 25