What Makes a Good Speech Tokenizer for LLM-Centric Speech Generation? A Systematic Study Paper • 2506.12537 • Published Jun 14, 2025 • 1
Beyond Scaling: Measuring and Predicting the Upper Bound of Knowledge Retention in Language Model Pre-Training Paper • 2502.04066 • Published Feb 6, 2025
TransferTOD: A Generalizable Chinese Multi-Domain Task-Oriented Dialogue System with Transfer Capabilities Paper • 2407.21693 • Published Jul 31, 2024
Muse: Towards Reproducible Long-Form Song Generation with Fine-Grained Style Control Paper • 2601.03973 • Published 20 days ago • 2
The Role of Entropy in Visual Grounding: Analysis and Optimization Paper • 2512.06726 • Published Dec 7, 2025 • 1
Critique-RL: Training Language Models for Critiquing through Two-Stage Reinforcement Learning Paper • 2510.24320 • Published Oct 28, 2025 • 21
Analyzing the Effects of Supervised Fine-Tuning on Model Knowledge from Token and Parameter Levels Paper • 2509.16596 • Published Sep 20, 2025 • 14
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning Paper • 2509.08755 • Published Sep 10, 2025 • 57
CRITICTOOL: Evaluating Self-Critique Capabilities of Large Language Models in Tool-Calling Error Scenarios Paper • 2506.13977 • Published Jun 11, 2025 • 10
Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments Paper • 2508.08791 • Published Aug 12, 2025 • 16
Measuring Data Diversity for Instruction Tuning: A Systematic Analysis and A Reliable Metric Paper • 2502.17184 • Published Feb 24, 2025 • 1
A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models Paper • 2505.07591 • Published May 12, 2025 • 11
A Comprehensive Capability Analysis of GPT-3 and GPT-3.5 Series Models Paper • 2303.10420 • Published Mar 18, 2023 • 1
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training Paper • 2501.11425 • Published Jan 20, 2025 • 109
ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use Paper • 2501.02506 • Published Jan 5, 2025 • 10
InstructUIE: Multi-task Instruction Tuning for Unified Information Extraction Paper • 2304.08085 • Published Apr 17, 2023
ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios Paper • 2401.00741 • Published Jan 1, 2024 • 1
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback Paper • 2401.11458 • Published Jan 21, 2024 • 2
LLM can Achieve Self-Regulation via Hyperparameter Aware Generation Paper • 2402.11251 • Published Feb 17, 2024 • 1