The Rise and Potential of Large Language Model Based Agents: A Survey Paper • 2309.07864 • Published Sep 14, 2023
What's Wrong with Your Code Generated by Large Language Models? An Extensive Study Paper • 2407.06153 • Published Jul 8, 2024
Code2Logic: Game-Code-Driven Data Synthesis for Enhancing VLMs General Reasoning Paper • 2505.13886 • Published May 20, 2025
PFDial: A Structured Dialogue Instruction Fine-tuning Method Based on UML Flowcharts Paper • 2503.06706 • Published Mar 9, 2025
What Makes a Good Speech Tokenizer for LLM-Centric Speech Generation? A Systematic Study Paper • 2506.12537 • Published Jun 14, 2025
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published Nov 6, 2025
OpenNovelty: An LLM-powered Agentic System for Verifiable Scholarly Novelty Assessment Paper • 2601.01576 • Published 7 days ago
Beyond Scaling: Measuring and Predicting the Upper Bound of Knowledge Retention in Language Model Pre-Training Paper • 2502.04066 • Published Feb 6, 2025
LLMEval-Fair: A Large-Scale Longitudinal Study on Robust and Fair Evaluation of Large Language Models Paper • 2508.05452 • Published Aug 7, 2025
TransferTOD: A Generalizable Chinese Multi-Domain Task-Oriented Dialogue System with Transfer Capabilities Paper • 2407.21693 • Published Jul 31, 2024
LLMEval: A Preliminary Study on How to Evaluate Large Language Models Paper • 2312.07398 • Published Dec 12, 2023
EvaLearn: Quantifying the Learning Capability and Efficiency of LLMs via Sequential Problem Solving Paper • 2506.02672 • Published Jun 3, 2025
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities Paper • 2401.15071 • Published Jan 26, 2024