Toward Efficient Agents: Memory, Tool learning, and Planning Paper • 2601.14192 • Published 2 days ago • 36
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published 14 days ago • 203
Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization Paper • 2601.05432 • Published 14 days ago • 160
TourPlanner: A Competitive Consensus Framework with Constraint-Gated Reinforcement Learning for Travel Planning Paper • 2601.04698 • Published 15 days ago • 10
Post
Based on the 2025 Chinese AI Timeline, here are some interesting takeaways:
✨ DeepSeek cadence: They shipped almost every month (except Feb 2025).
✨ Qwen trajectory: Not a single "hit" model, but an expanding product line: VL/Math/Coder/Reranker/Embedding/Omni/Next/Image.
✨ Multimodal trend: Steadily rising share, shifting from generation to editing + tooling.
✨ Reasoning as a main track: More engineered, system-level reasoning.
✨ From foundation to components: Growth in infra models (embeddings, rerankers, OCR, speech) signals a move toward deployable stacks.
✨ Ecosystem broadening: More players beyond the top labs.
Follow for more updates 👉 zh-ai-community
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 253
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published Nov 23, 2025 • 294
A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers Paper • 2508.21148 • Published Aug 28, 2025 • 140
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey Paper • 2509.02547 • Published Sep 2, 2025 • 229
A Survey of Reinforcement Learning for Large Reasoning Models Paper • 2509.08827 • Published Sep 10, 2025 • 190