---
language:
- multilingual
license: other
license_name: kwaipilot-license
license_link: LICENSE
library_name: transformers
---

<div align="center">
<img src="https://raw.githubusercontent.com/Anditty/OASIS/refs/heads/main/Group.svg" width="60%" alt="Kwaipilot" />
</div>

<hr>

# Kwaipilot **KwaiCoder-AutoThink-preview** (AutoThink Preview)

**Update (2025-06-10):** The model has been updated to the latest version with improved performance and stability.

**KwaiCoder-AutoThink-preview** is the first public *AutoThink* LLM released by the **Kwaipilot** team at Kuaishou.
The model merges *thinking* and *non‑thinking* abilities into a single checkpoint and **dynamically adjusts its reasoning depth** based on the input’s difficulty.

***

## ✨ Key Highlights

| Feature | What it means | Benefit |
|---------|---------------|---------|
| **Auto Think** | Diverse *pre‑think* data teaches the model to predict task difficulty | Better choice of when to think |
| **Step‑SRPO** | Token‑wise GRPO variant with process‑level rewards | More stable RL, higher “think” & “no‑think” accuracy |
| **Agentic Data** | Automated CoT cold‑start data generation | Stronger reasoning models before reinforcement learning |
| **KD + MTP** | 1 teacher → many‑token prediction distillation | <1/30 of the pre‑training cost |
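
Step‑SRPO is described above only as a token‑wise GRPO variant with process‑level rewards, and its details are not given in this card. As background, the sketch below shows the group‑relative advantage used by standard GRPO, where each sampled response's reward is normalized against the other responses to the same prompt. It is not Step‑SRPO itself: the function name, the example rewards, and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own sampling group (baseline GRPO).

    `rewards` holds one scalar reward per response sampled for the same prompt.
    `eps` guards against division by zero when all rewards are identical.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an illustrative choice
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four responses to one prompt (1.0 = correct).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Responses that score above the group mean receive a positive advantage and those below receive a negative one, so no separate value model is needed for the baseline.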

***

## Evaluation Results

***

## 🔧 Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "Kwaipilot/KwaiCoder-AutoThink-preview"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# prepare the model input
prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768,
    temperature=0.6,
    top_p=0.9,
)
# keep only the newly generated tokens (drop the prompt tokens)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
content = tokenizer.decode(output_ids, skip_special_tokens=True).strip("\n")

print("prompt:\n", prompt)
print("content:\n", content)
"""
prompt:
Give me a short introduction to large language model.
content:
<judge>
This is a definitional query seeking a basic explanation, which can be answered with straightforward factual recall or a concise summary. Requires think-off mode.
</judge>
<think off>
Large Language Models (LLMs) are advanced artificial intelligence systems designed to understand and generate human-like text. They are trained on vast amounts of data to learn grammar, facts, reasoning, and context. Key features include:
- **Scale**: Billions (or even trillions) of parameters, enabling complex pattern recognition.
- **Versatility**: Can perform tasks like answering questions, writing code, summarizing text, and more.
- **Adaptability**: Fine-tuned for specific uses (e.g., customer support, creative writing).
Examples include OpenAI's GPT, Google's Gemini, and Meta's Llama. While powerful, LLMs may occasionally hallucinate or rely on outdated information. They’re transforming industries by automating text-based tasks and enhancing human productivity.
Would you like a deeper dive into any aspect?
"""
```
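
The sample output above begins with a `<judge>` block that explains the difficulty assessment, followed by a `<think off>` marker and then the answer. A minimal sketch for separating those parts is shown below. It assumes the tags appear exactly as in the sample, and that a `<think on>` marker exists as the counterpart for the thinking mode, which this card does not show. The helper name `split_autothink_output` is hypothetical and not part of any library.

```python
import re

# Tag names follow the sample output; "<think on>" is an assumed counterpart.
_JUDGE_RE = re.compile(r"<judge>\s*(.*?)\s*</judge>", re.DOTALL)
_MODE_RE = re.compile(r"<think (on|off)>")


def split_autothink_output(content: str) -> dict:
    """Split decoded text into the judge rationale, think mode, and answer."""
    judge_match = _JUDGE_RE.search(content)
    mode_match = _MODE_RE.search(content)

    # The answer is whatever follows the last recognized marker.
    if mode_match:
        answer = content[mode_match.end():]
    elif judge_match:
        answer = content[judge_match.end():]
    else:
        answer = content

    return {
        "judge": judge_match.group(1) if judge_match else None,
        "think": mode_match.group(1) if mode_match else None,
        "answer": answer.strip(),
    }


# Shortened stand-in for the `content` string produced by the Quick Start code.
sample = (
    "<judge>\nThis is a definitional query. Requires think-off mode.\n</judge>\n"
    "<think off>\nLarge Language Models (LLMs) are advanced AI systems."
)
parsed = split_autothink_output(sample)
print(parsed["think"])
print(parsed["answer"])
```

If the thinking mode wraps its reasoning in additional tags, the `answer` field will still contain that trace, so further stripping may be needed before showing the text to users.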

***

## 🏗️ TODO

- Technical reports will be released soon.
- A release version of the model with improved performance is coming soon.

***

## 🚦 Limitations & Notes

- The preview checkpoint may occasionally over‑think or under‑think on inputs outside its training distribution.
- Use responsibly; verify factual outputs, especially when disabling thought traces.

***

## 📜 License

This repository is licensed under the **MIT License**. Use of the KwaiCoder-AutoThink models is subject to the Model License. KwaiCoder-AutoThink models support commercial use.
See [LICENSE-MODEL](https://huggingface.co/Kwaipilot/KwaiCoder-AutoThink-preview/blob/main/LICENSE) for more details.

***

*This is a **preview** release. We will publish the full training recipe, data, and benchmarks soon.*