ParaVT: Taming the Tool Prior Paradox for Parallel Tool Use in Agentic Video Reinforcement Learning Paper • 2605.20342 • Published May 19 • 34
frankdarkluo/DeepSeek-R1-Distill-Qwen-32B-GPTQ-Int8 Text Generation • 33B • Updated Dec 26, 2025 • 1.56k • 2
Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense Paper • 2510.07242 • Published Oct 8, 2025 • 30
PosterGen: Aesthetic-Aware Paper-to-Poster Generation via Multi-Agent LLMs Paper • 2508.17188 • Published Aug 24, 2025 • 17
Flora: Low-Rank Adapters Are Secretly Gradient Compressors Paper • 2402.03293 • Published Feb 5, 2024 • 6