Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning Paper • 2602.10090 • Published Feb 10 • 53
stillarrow/qwen2.5-math-7b__skill_accuracy_binning_max_entrop-6bc47709-et_mix_lambda_no_drift_off_ratio_100 Updated about 15 hours ago • 21
stillarrow/qwen2.5-math-7b__skill_accuracy_binning_max_entrop-aabaf976-policy_lambda_no_drift_off_ratio_100 Updated about 16 hours ago • 18
nvidia/llama-nv-embed-reasoning-3b Feature Extraction • 3B • Updated 26 days ago • 2.29k • 18
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published 22 days ago • 117
Heterogeneous Agent Collaborative Reinforcement Learning Paper • 2603.02604 • Published Mar 3 • 194
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 38 items • Updated Mar 2 • 363
PaperBanana: Automating Academic Illustration for AI Scientists Paper • 2601.23265 • Published Jan 30 • 227
MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification Paper • 2603.15726 • Published Mar 16 • 186
NeMo Gym Collection Collection of RL verifiable data for NeMo Gym • 22 items • Updated 16 days ago • 57