Awaking Spatial Intelligence in Unified Multimodal Understanding and Generation Paper • 2605.04128 • Published May 5 • 17
A Self-Evolving Framework for Efficient Terminal Agents via Observational Context Compression Paper • 2604.19572 • Published Apr 21 • 23
Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges Paper • 2604.13602 • Published Apr 15 • 32
DeVI: Physics-based Dexterous Human-Object Interaction via Synthetic Video Imitation Paper • 2604.20841 • Published Apr 22 • 24
EditCrafter: Tuning-free High-Resolution Image Editing via Pretrained Diffusion Model Paper • 2604.10268 • Published Apr 11 • 12
TingIS: Real-time Risk Event Discovery from Noisy Customer Incidents at Enterprise Scale Paper • 2604.21889 • Published Apr 23 • 12
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens • Visualize synthetic-data experiments as an interactive bookshelf • 263
New TRL + OpenEnv example! 💥 Fine-tune an LLM to play Sudoku using an RL environment via OpenEnv. Includes a script that runs on one or multiple GPUs with vLLM, plus a Colab-ready notebook. Enjoy!
Notebook: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/openenv_sudoku_grpo.ipynb
Script: https://github.com/huggingface/trl/blob/main/examples/scripts/openenv/sudoku.py