Awaking Spatial Intelligence in Unified Multimodal Understanding and Generation Paper • 2605.04128 • Published 8 days ago • 16
A Self-Evolving Framework for Efficient Terminal Agents via Observational Context Compression Paper • 2604.19572 • Published 22 days ago • 22
Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges Paper • 2604.13602 • Published 28 days ago • 32
DeVI: Physics-based Dexterous Human-Object Interaction via Synthetic Video Imitation Paper • 2604.20841 • Published 21 days ago • 24
EditCrafter: Tuning-free High-Resolution Image Editing via Pretrained Diffusion Model Paper • 2604.10268 • Published Apr 11 • 12
TingIS: Real-time Risk Event Discovery from Noisy Customer Incidents at Enterprise Scale Paper • 2604.21889 • Published 20 days ago • 12
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens 📝 Explore synthetic data experiments on a virtual bookshelf. Space • Running on CPU Upgrade • 233
Post: New TRL + OpenEnv example! 💥 Fine-tune an LLM to play Sudoku using an RL environment via OpenEnv. Includes a script that runs on one or multiple GPUs with vLLM, plus a Colab-ready notebook. Enjoy!
Notebook: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/openenv_sudoku_grpo.ipynb
Script: https://github.com/huggingface/trl/blob/main/examples/scripts/openenv/sudoku.py
Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective Article • LinkedIn • Jan 27 • 74
AssetOpsBench: Bridging the Gap Between AI Agent Benchmarks and Industrial Reality Article • ibm-research • Jan 21 • 33
We Got Claude to Build CUDA Kernels and teach open models! Article • burtenshaw, evalstate, merve, pcuenq, +2 more • Jan 28 • 156