Joint Extraction of Entities and Relations Based on a Novel Decomposition Strategy Paper • 1909.04273 • Published Sep 10, 2019
ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding Paper • 2210.06155 • Published Oct 12, 2022
Upcycling Instruction Tuning from Dense to Mixture-of-Experts via Parameter Merging Paper • 2410.01610 • Published Oct 2, 2024
Inner Thinking Transformer: Leveraging Dynamic Depth Scaling to Foster Adaptive Internal Thinking Paper • 2502.13842 • Published Feb 19, 2025
Debiasing Multimodal Large Language Models via Noise-Aware Preference Optimization Paper • 2503.17928 • Published Mar 23, 2025 • 2
COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence Paper • 2512.04563 • Published Dec 4, 2025 • 16
DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion Paper • 2406.06567 • Published Jun 3, 2024
NACL: A General and Effective KV Cache Eviction Framework for LLMs at Inference Time Paper • 2408.03675 • Published Aug 7, 2024
Mixture of Universal Experts: Scaling Virtual Width via Depth-Width Transformation Paper • 2603.04971 • Published Mar 5 • 3
CLEAR: Unlocking Generative Potential for Degraded Image Understanding in Unified Multimodal Models Paper • 2604.04780 • Published Apr 6 • 10
VideoAR: Autoregressive Video Generation via Next-Frame & Scale Prediction Paper • 2601.05966 • Published Jan 9 • 23