jonathanhe123 and nielsr (HF Staff) committed
Commit 441beeb · verified · 1 Parent(s): 0aff694

Link model to paper and improve model card (#1)

- Link model to paper and improve model card (543f6c5ecf244fce57e8f4fbc4b7377ecc654e8c)


Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>

Files changed (1):
  1. README.md +22 -11
README.md CHANGED
@@ -1,32 +1,43 @@
 ---
-license: apache-2.0
-datasets:
-- ChilleD/MultiArith
 base_model:
 - optimum/mistral-1.1b-testing
+datasets:
+- ChilleD/MultiArith
+license: apache-2.0
 pipeline_tag: text-generation
+arxiv: 2510.24940
 tags:
 - model_hub_mixin
 - pytorch_model_hub_mixin
+- chain-of-thought
+- implicit-reasoning
 ---
 
 # SemCoT: Accelerating Chain-of-Thought Reasoning through Semantically-Aligned Implicit Tokens
+
 ## 🚀 Overview
-SemCoT is a framework that improves the efficiency of Chain-of-Thought (CoT) reasoning by encoding reasoning steps inside hidden representations instead of generating long textual explanations. This implicit reasoning greatly speeds up inference while keeping performance high. Specifically, we take optimum/mistral-1.1b-testing as the base model and fine tune using the SemCoT framework on ChilleD/MultiArith dataset. See [our paper](https://arxiv.org/abs/2510.24940) and [our code](https://github.com/YinhanHe123/SemCoT/). Please reference the code for how to load and use the model.
+**SemCoT** is a framework designed to accelerate Chain-of-Thought (CoT) reasoning in Large Language Models (LLMs) by replacing verbose explicit reasoning with compact, semantically-aligned implicit tokens. Instead of generating long textual explanations, SemCoT encodes reasoning steps within hidden representations (implicit reasoning), which significantly speeds up inference while maintaining high performance.
 
-## 🎯 Key Features
-🗣️ Semantic Alignment: Uses a contrastively trained sentence transformer to ensure that implicit reasoning remains semantically consistent with human-readable CoT explanations.
+This specific checkpoint is a fine-tuned version of `optimum/mistral-1.1b-testing` using the SemCoT framework on the `ChilleD/MultiArith` dataset.
 
-Efficiency Optimization: Introduces a lightweight implicit reasoning generator, fine-tuned via knowledge distillation, to reduce token generation time and enhance inference speed.
+- **Paper:** [SemCoT: Accelerating Chain-of-Thought Reasoning through Semantically-Aligned Implicit Tokens](https://huggingface.co/papers/2510.24940)
+- **Authors:** Yinhan He, Wendy Zheng ([@wendyz123](https://huggingface.co/wendyz123)), Yaochen Zhu ([@yaochenzhu](https://huggingface.co/yaochenzhu)), Zaiyi Zheng, Lin Su, Sriram Vasudevan ([@sriramvasudevan](https://huggingface.co/sriramvasudevan)), Qi Guo, Liangjie Hong, Jundong Li.
+- **Code:** [Official GitHub Repository](https://github.com/YinhanHe123/SemCoT)
+
+## 🎯 Key Features
+- 🗣️ **Semantic Alignment**: Uses a contrastively trained sentence transformer to ensure that implicit reasoning remains semantically consistent with human-readable CoT explanations.
+- ⚡ **Efficiency Optimization**: Introduces a lightweight implicit reasoning generator, fine-tuned via knowledge distillation, to reduce token generation time and enhance inference speed.
+- 🧩 **Joint Optimization**: SemCoT is the first approach that enhances CoT efficiency by jointly optimizing token-level generation speed and preserving semantic alignment with ground-truth reasoning.
 
-🧩 Joint Optimization: SemCoT simultaneously optimizes for reasoning speed and semantic alignment.
+## 🛠️ Usage
+Please refer to the [official GitHub repository](https://github.com/YinhanHe123/SemCoT/) for instructions on environment setup, data generation, and how to run the evaluation scripts for this model.
 
 ## Citation
-```
-@article{he2025semcot,
+```bibtex
+@inproceedings{he2025semcot,
   title={SemCoT: Accelerating Chain-of-Thought Reasoning through Semantically-Aligned Implicit Tokens},
   author={He, Yinhan and Zheng, Wendy and Zhu, Yaochen and Zheng, Zaiyi and Su, Lin and Vasudevan, Sriram and Guo, Qi and Hong, Liangjie and Li, Jundong},
-  journal={arXiv preprint arXiv:2510.24940},
+  booktitle={39th Conference on Neural Information Processing Systems (NeurIPS 2025)},
   year={2025}
 }
 ```