SemCoT: Accelerating Chain-of-Thought Reasoning through Semantically-Aligned Implicit Tokens

🚀 Overview

SemCoT is a framework designed to accelerate Chain-of-Thought (CoT) reasoning in Large Language Models (LLMs) by replacing verbose explicit reasoning with compact, semantically-aligned implicit tokens. Instead of generating long textual explanations, SemCoT encodes reasoning steps within hidden representations (implicit reasoning), which significantly speeds up inference while preserving answer accuracy.

This specific checkpoint is a fine-tuned version of optimum/mistral-1.1b-testing using the SemCoT framework on the ChilleD/MultiArith dataset.

🎯 Key Features

  • 🗣️ Semantic Alignment: Uses a contrastively trained sentence transformer to ensure that implicit reasoning remains semantically consistent with human-readable CoT explanations.
  • ⚡ Efficiency Optimization: Introduces a lightweight implicit reasoning generator, fine-tuned via knowledge distillation, to reduce token generation time and speed up inference.
  • 🧩 Joint Optimization: SemCoT is the first approach to enhance CoT efficiency by jointly optimizing token-level generation speed and semantic alignment with ground-truth reasoning (a simplified sketch of such a joint objective follows this list).

🛠️ Usage

Please refer to the official GitHub repository for instructions on environment setup, data generation, and running the evaluation scripts for this model.
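
As a rough starting point, the checkpoint can be loaded with the standard transformers API, as in the sketch below. This is a generic causal-LM example; the prompt format and decoding settings are assumptions, and it does not reproduce the repository's implicit-reasoning evaluation pipeline.

```python
# Generic loading sketch; see the GitHub repository for the full SemCoT pipeline.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jonathanhe123/SemCoT-mistral-1.1b-multiarith"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Example MultiArith-style question (prompt format is an assumption).
prompt = "Q: There are 3 boxes with 4 apples in each box. How many apples are there in total?\nA:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```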

Citation

@inproceedings{he2025semcot,
  title={SemCoT: Accelerating Chain-of-Thought Reasoning through Semantically-Aligned Implicit Tokens},
  author={He, Yinhan and Zheng, Wendy and Zhu, Yaochen and Zheng, Zaiyi and Su, Lin and Vasudevan, Sriram and Guo, Qi and Hong, Liangjie and Li, Jundong},
  booktitle={39th Conference on Neural Information Processing Systems (NeurIPS 2025)},
  year={2025}
}