Article OpenEnv in Practice: Evaluating Tool-Using Agents in Real-World Environments +3 Feb 12 • 31
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published Jan 8 • 229
Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting Paper • 2601.02151 • Published Jan 5 • 113
Article Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers +5 Sep 11, 2025 • 185