---
language:
- ko
- en
- ja
- zh
- multilingual
license: apache-2.0
tags:
- qwen3.5
- korean
- reasoning
- thinking
- sft
- k-ai
base_model:
- FINAL-Bench/Darwin-27B-Opus
pipeline_tag: text-generation
library_name: transformers
---
# AWAXIS-Think-27b
A reasoning model based on [FINAL-Bench/Darwin-27B-Opus](https://huggingface.co/FINAL-Bench/Darwin-27B-Opus), further trained with high-quality, Korean-specialized SFT.
> ⚠️ **Requirements / Loading notes**
> This model uses `model_type: qwen3_5_text` (the Qwen3.5 hybrid architecture).
> It loads correctly only with **`transformers >= 5.5.4`**.
>
> ```bash
> pip install --upgrade "transformers>=5.5.4"
> # or the latest development version
> pip install "transformers @ git+https://github.com/huggingface/transformers.git@main"
> ```
>
> On older versions of transformers, loading fails with an error saying that `model_type 'qwen3_5_text'` is not recognized.
> The cause is an outdated library, and the commands above resolve it.
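
Because loading depends on the installed `transformers` version, a quick pre-flight check can catch the problem before any weights are downloaded. The sketch below is only an illustration: the helper names (`parse_release`, `meets_minimum`, `check_transformers`) are hypothetical, and the comparison looks only at the numeric release components, so a pre-release such as `5.5.4rc1` is treated the same as `5.5.4`.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum version stated in this README.
MIN_TRANSFORMERS = (5, 5, 4)


def parse_release(version_string: str) -> tuple:
    """Return the leading numeric components, e.g. '5.6.0.dev0' -> (5, 6, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: tuple = MIN_TRANSFORMERS) -> bool:
    """Compare numeric release components only (pre-release tags are ignored)."""
    release = parse_release(installed)
    width = max(len(release), len(minimum))
    # Pad with zeros so that '6' compares as (6, 0, 0).
    padded_release = release + (0,) * (width - len(release))
    padded_minimum = minimum + (0,) * (width - len(minimum))
    return padded_release >= padded_minimum


def check_transformers() -> str:
    """Describe whether the installed transformers satisfies the requirement."""
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        return "transformers is not installed"
    if meets_minimum(installed):
        return f"transformers {installed} satisfies >= 5.5.4"
    return f"transformers {installed} is older than 5.5.4; upgrade before loading"


if __name__ == "__main__":
    print(check_transformers())
```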
## Method
- **Base Model**: [Darwin-27B-Opus](https://huggingface.co/FINAL-Bench/Darwin-27B-Opus) (Qwen3.5-27B family)
- **Korean SFT**: Supervised Fine-Tuning on high-quality instruction data centered on Korea-specific knowledge, including Korean culture, history, law, economics, society, and geography
- **Thinking Mode**: Supports step-by-step Chain-of-Thought reasoning through `<think>` tags
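Downstream code often needs the final answer separately from the reasoning trace. The following is a minimal sketch of one way to split them. It assumes the decoded text contains a closing `</think>` tag and that the opening `<think>` tag may or may not be present (some chat templates place it in the prompt rather than in the generated text). The function name `split_thinking` is hypothetical, and the exact output format should be checked against real generations from this model.

```python
def split_thinking(text: str) -> tuple[str, str]:
    """Split decoded model output into (reasoning, answer).

    If no closing </think> tag is found, the whole text is treated as the answer.
    """
    head, closing_tag, tail = text.partition("</think>")
    if not closing_tag:
        return "", text.strip()
    # Keep only what follows the opening tag, if it appears in the output.
    reasoning = head.split("<think>", 1)[-1]
    return reasoning.strip(), tail.strip()


sample = "<think>\nCompare the two dates first.\n</think>\n\nThe first one is earlier."
reasoning, answer = split_thinking(sample)
print("reasoning:", reasoning)
print("answer:", answer)
```

Partitioning on the closing tag first keeps the function usable whether or not the opening tag is part of the generated text.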
## Benchmark
| Benchmark | Score |
|---|---|
| CLIcK (Korean Cultural & Linguistic Knowledge) | **81.0%** |
| KMMLU-Pro (Korean MMLU Professional) | **74.0%** |
## Model Specifications
| Property | Value |
|---|---|
| **Architecture** | Qwen3.5 Hybrid (GatedDeltaNet + Attention, 64 layers) |
| **Parameters** | ~27B |
| **Hidden Size** | 5120 |
| **Intermediate Size** | 16384 |
| **Context Length** | 262,144 tokens |
| **Precision** | BF16 |
| **Vocab Size** | 248,320 |
| **Thinking** | Supported (`<think>` tags) |
| **License** | Apache 2.0 |
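The figures in the table allow a rough estimate of the memory needed for the weights alone. The arithmetic below is a back-of-the-envelope sketch: it takes "~27B" as exactly 27 billion parameters, counts only the weights (no KV cache, activations, or framework overhead), and the function names are illustrative.

```python
# Figures taken from the specification table above.
APPROX_PARAMS = 27_000_000_000  # "~27B" is approximate
BYTES_PER_PARAM_BF16 = 2        # BF16 stores each value in 16 bits
VOCAB_SIZE = 248_320
HIDDEN_SIZE = 5_120


def weight_memory_gb(num_params: int, bytes_per_param: int) -> float:
    """Weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def embedding_params(vocab_size: int, hidden_size: int) -> int:
    """Parameter count of a single vocab-by-hidden embedding matrix."""
    return vocab_size * hidden_size


print(f"BF16 weights: ~{weight_memory_gb(APPROX_PARAMS, BYTES_PER_PARAM_BF16):.0f} GB")
print(f"Token embedding matrix: {embedding_params(VOCAB_SIZE, HIDDEN_SIZE):,} parameters")
```

Under these assumptions the weights occupy about 54 GB in BF16, so a single 80 GB accelerator or several smaller devices (as with `device_map="auto"`) would be needed before accounting for the context-dependent memory.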
## Usage
> **Requirements**: `transformers >= 5.5.4`
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model = AutoModelForCausalLM.from_pretrained(
    "Anserwise/AWAXIS-Think-27b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Anserwise/AWAXIS-Think-27b")
# Korean prompt: "Please explain the civil service examination system of the Joseon Dynasty."
messages = [{"role": "user", "content": "조선시대의 과거제도에 대해 설명해주세요."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```
### vLLM
```bash
vllm serve Anserwise/AWAXIS-Think-27b \
    --enforce-eager \
    --max-model-len 32768 \
    --dtype bfloat16
```
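Once the server is running, it can be queried through vLLM's OpenAI-compatible HTTP API. The sketch below uses only the Python standard library and assumes the server listens on the default `http://localhost:8000`. The helper names `build_chat_request` and `chat` are hypothetical, as are the default `max_tokens` and timeout values.

```python
import json
import urllib.request

MODEL_ID = "Anserwise/AWAXIS-Think-27b"
BASE_URL = "http://localhost:8000"  # assumed: vLLM's default host and port


def build_chat_request(prompt: str, model: str = MODEL_ID, max_tokens: int = 1024) -> dict:
    """Build a JSON payload for the /v1/chat/completions endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def chat(prompt: str, base_url: str = BASE_URL, timeout: float = 600.0) -> str:
    """Send one user message and return the assistant's reply text."""
    payload = json.dumps(build_chat_request(prompt)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server up, `print(chat("조선시대의 과거제도에 대해 설명해주세요."))` sends the same Korean prompt used in the Transformers example.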
## Features
- Inherits the strong reasoning ability of Darwin-27B-Opus
- Strengthened Korea-specific knowledge, including Korean culture, history, law, economics, and society
- Step-by-step reasoning through thinking mode
- Multilingual support (Korean, English, Japanese, Chinese)
- Supports a 262K-token context length
## Training
| Item | Details |
|---|---|
| **Base Model** | [FINAL-Bench/Darwin-27B-Opus](https://huggingface.co/FINAL-Bench/Darwin-27B-Opus) |
| **Method** | Korean-specialized Supervised Fine-Tuning |
| **Data** | High-quality instruction data centered on Korean culture and knowledge |
| **Developer** | [Anserwise](https://huggingface.co/Anserwise) |
## Acknowledgements
- [FINAL-Bench](https://huggingface.co/FINAL-Bench): Darwin-27B-Opus base model
- [Qwen Team](https://huggingface.co/Qwen): Qwen3.5 architecture