research-catchup - a songsh Collection

songsh 's Collections

research-catchup

research-catchup

updated 4 days ago

Llama-3.1-FoundationAI-SecurityLLM-8B-Instruct Technical Report

Paper • 2508.01059 • Published Aug 1, 2025 • 34
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens

Paper • 2508.01191 • Published Aug 2, 2025 • 240
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7, 2025 • 189
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Paper • 2508.06471 • Published Aug 8, 2025 • 211
UI-Venus Technical Report: Building High-performance UI Agents with RFT

Paper • 2508.10833 • Published Aug 14, 2025 • 46
DINOv3

Paper • 2508.10104 • Published Aug 13, 2025 • 308
SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published Aug 14, 2025 • 97
Thyme: Think Beyond Images

Paper • 2508.11630 • Published Aug 15, 2025 • 81
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2, 2025 • 238
Self-Distilled Agentic Reinforcement Learning

Paper • 2605.15155 • Published 8 days ago • 106