LACONIC: Length-Aware Constrained Reinforcement Learning for LLM
Paper: 2602.14468
This repository hosts LACONIC-DeepScaleR-1.5B-2000, a LACONIC-trained variant of agentica-org/DeepScaleR-1.5B-Preview.
LACONIC is a length-aware reinforcement learning method for making LLM responses substantially shorter while preserving task performance. During training, it combines task reward with an adaptive length-based cost so that the model learns to stay near a target response budget. This checkpoint targets a budget of 2000 tokens.
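To make the training objective concrete, here is a minimal, illustrative sketch of a length-aware constrained reward. It is not the paper's exact formulation: the penalty shape, the dual-style multiplier update, and the constants `LAM_LR` and the normalization by the budget are all assumptions made for illustration; only the 2000-token budget comes from this card.

```python
# Illustrative sketch (not the paper's exact formulation): task reward paired
# with an adaptive length cost. The multiplier lam grows when responses run
# past the budget and shrinks otherwise, so training settles near the target.

BUDGET = 2000      # target response budget in tokens (this checkpoint's setting)
LAM_LR = 0.01      # assumed step size for the multiplier update

def shaped_reward(task_reward: float, response_len: int, lam: float) -> float:
    """Task reward minus an adaptive, length-based cost."""
    over = max(0, response_len - BUDGET)   # only penalize tokens past the budget
    return task_reward - lam * over / BUDGET

def update_multiplier(lam: float, batch_lens: list) -> float:
    """Dual-ascent-style update: raise lam if the batch runs long on average."""
    avg_len = sum(batch_lens) / len(batch_lens)
    lam += LAM_LR * (avg_len - BUDGET) / BUDGET
    return max(0.0, lam)                   # multiplier stays non-negative

# Under budget: no length cost is applied.
print(shaped_reward(1.0, 1500, lam=0.5))   # → 1.0
# Over budget: reward is docked in proportion to the overshoot.
print(shaped_reward(1.0, 2500, lam=0.5))   # 1.0 - 0.5*500/2000 = 0.875
```

The adaptive multiplier is what lets the model learn to stay *near* the budget rather than collapsing to the shortest possible answers: the cost only bites as hard as the current constraint violation warrants.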
In practice, LACONIC is designed to reduce response length with minimal deployment overhead: the released model uses the standard decoding stack and requires no special inference-time control logic.
Base model: agentica-org/DeepScaleR-1.5B-Preview (itself derived from deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)