---
language:
- multilingual
license: other
license_name: kwaipilot-license
license_link: LICENSE
library_name: transformers
---
<div align="center">
  <img src="https://raw.githubusercontent.com/Anditty/OASIS/refs/heads/main/Group.svg" width="60%" alt="Kwaipilot" />
</div>

<hr>

# Kwaipilot **KwaiCoder-AutoThink-preview** (AutoThink Preview)

**Update (2025-06-10):** The model has been updated to the latest version with improved performance and stability.

**KwaiCoder-AutoThink-preview** is the first public *AutoThink* LLM released by the **Kwaipilot** team at Kuaishou.  
The model merges *thinking* and *non‑thinking* abilities into a single checkpoint and **dynamically adjusts its reasoning depth** based on the input’s difficulty.

***

## ✨ Key Highlights

| Feature | What it means | Benefit |
|---------|---------------|---------|
| **Auto Think** | Diverse *pre‑think* data teaches the model to predict task difficulty | Better choice of when to think |
| **Step‑SRPO** | Token‑wise GRPO variant with process‑level rewards | More stable RL, higher “think” & “no‑think” accuracy |
| **Agentic Data** | Automated CoT cold‑start data generation | Stronger reasoning model before reinforcement learning |
| **KD + MTP** | Distillation from a single teacher with multi‑token prediction | <1⁄30 of the pre‑training cost |

***

## Evaluation Results

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f597588b6d053c709debd9/VRv2I4fWCQfmerQFtwQNn.png)

***

## 🔧 Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "Kwaipilot/KwaiCoder-AutoThink-preview"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# prepare the model input
prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

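# generate the completion
# do_sample=True is set so that temperature and top_p take effect
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# keep only the newly generated tokens (drop the prompt tokens)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()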
content = tokenizer.decode(output_ids, skip_special_tokens=True).strip("\n")
print("prompt:\n", prompt)
print("content:\n", content)
"""
prompt:
Give me a short introduction to large language model.
content:
<judge>
This is a definitional query seeking a basic explanation, which can be answered with straightforward factual recall or a concise summary. Requires think-off mode.
</judge>

<think off>
Large Language Models (LLMs) are advanced artificial intelligence systems designed to understand and generate human-like text. They are trained on vast amounts of data to learn grammar, facts, reasoning, and context. Key features include:  

- **Scale**: Billions (or even trillions) of parameters, enabling complex pattern recognition.  
- **Versatility**: Can perform tasks like answering questions, writing code, summarizing text, and more.  
- **Adaptability**: Fine-tuned for specific uses (e.g., customer support, creative writing).  

Examples include OpenAI's GPT, Google's Gemini, and Meta's Llama. While powerful, LLMs may occasionally hallucinate or rely on outdated information. They’re transforming industries by automating text-based tasks and enhancing human productivity.  

Would you like a deeper dive into any aspect?
"""
```
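The sample output above begins with a `<judge>` block followed by a `<think off>` marker and then the answer. A minimal sketch of how that text could be split into its parts is shown below. The helper name `split_autothink_output` and the `ParsedOutput` type are hypothetical, and the `<think on>` marker is an assumption by analogy with `<think off>`; only the think-off layout appears in this card, so adjust the patterns to whatever the model actually emits.

```python
import re
from typing import NamedTuple, Optional


class ParsedOutput(NamedTuple):
    judge: Optional[str]  # text inside <judge>...</judge>, if present
    mode: Optional[str]   # "on" or "off", if a <think ...> marker is present
    body: str             # everything after the marker (or after the judge block)


# Patterns based on the sample output in this card.
# ASSUMPTION: "<think on>" mirrors "<think off>"; it is not shown in the card.
_JUDGE_RE = re.compile(r"<judge>\s*(.*?)\s*</judge>", re.DOTALL)
_MODE_RE = re.compile(r"<think (on|off)>")


def split_autothink_output(content: str) -> ParsedOutput:
    """Split decoded model text into judge rationale, think mode, and body."""
    judge_match = _JUDGE_RE.search(content)
    judge = judge_match.group(1) if judge_match else None
    rest = content[judge_match.end():] if judge_match else content

    mode_match = _MODE_RE.search(rest)
    if mode_match:
        return ParsedOutput(judge, mode_match.group(1), rest[mode_match.end():].strip())
    # No marker found: return the remaining text unchanged apart from trimming.
    return ParsedOutput(judge, None, rest.strip())
```

With the Quick Start variables, `split_autothink_output(content).mode` would then indicate which mode the model chose for a given prompt, and `.body` would hold the text after the marker.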

***

## 🏗️ TODO

- A technical report will be released soon.
- A release version of the model with improved performance is coming soon.

***

## 🚦 Limitations & Notes

- The preview checkpoint may occasionally over‑ or under‑think on inputs outside its training distribution.
- Use responsibly; verify factual outputs, especially when disabling thought traces.

***

## 📜 License

This repository is licensed under the **MIT License**. Use of the KwaiCoder-AutoThink models is subject to the Model License. KwaiCoder-AutoThink models support commercial use.

See the [LICENSE-MODEL](https://huggingface.co/Kwaipilot/KwaiCoder-AutoThink-preview/blob/main/LICENSE) for more details.

***


*This is a **preview** release. We will publish the full training recipe, data, and benchmarks soon.*