language:
  - ko
  - en
  - ja
  - zh
  - multilingual
license: apache-2.0
tags:
  - qwen3.5
  - korean
  - reasoning
  - thinking
  - sft
  - k-ai
base_model:
  - FINAL-Bench/Darwin-27B-Opus
pipeline_tag: text-generation
library_name: transformers

AWAXIS-Think-27b

A reasoning model based on FINAL-Bench/Darwin-27B-Opus, fine-tuned with high-quality, Korean-specialized SFT.

โš ๏ธ Requirements / Loading ์ฃผ์˜ ์ด ๋ชจ๋ธ์€ model_type: qwen3_5_text (Qwen3.5 ํ•˜์ด๋ธŒ๋ฆฌ๋“œ ์•„ํ‚คํ…์ฒ˜)๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค. transformers >= 5.5.4 ์ด์ƒ ์—์„œ๋งŒ ์ •์ƒ ๋กœ๋“œ๋ฉ๋‹ˆ๋‹ค.

pip install --upgrade "transformers>=5.5.4"
# or the latest development version
pip install "transformers @ git+https://github.com/huggingface/transformers.git@main"

On older transformers versions, the error reporting that model_type 'qwen3_5_text' is not recognized is caused by an outdated library. Either command above resolves it.
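Before loading the model, it can help to confirm that the installed transformers version meets the stated minimum. The following is a minimal sketch using only the standard library. The helper names (`release_tuple`, `meets_minimum`, `check_transformers`) are hypothetical, and the comparison deliberately ignores pre-release suffixes such as `.dev0` or `rc1`, which is a simplification.

```python
import re
from importlib import metadata

MIN_TRANSFORMERS = "5.5.4"  # minimum version stated in this README


def release_tuple(version: str) -> tuple:
    """Return the first three numeric components, e.g. '5.6.0.dev0' -> (5, 6, 0)."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad so that '5.6' compares like '5.6.0'.
    return tuple(parts + [0] * (3 - len(parts)))


def meets_minimum(installed: str, minimum: str = MIN_TRANSFORMERS) -> bool:
    """Compare release numbers only (pre-release suffixes are ignored)."""
    return release_tuple(installed) >= release_tuple(minimum)


def check_transformers() -> bool:
    """Report whether the installed transformers package is new enough."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed")
        return False
    ok = meets_minimum(installed)
    status = "OK" if ok else f"too old, upgrade to >= {MIN_TRANSFORMERS}"
    print(f"transformers {installed}: {status}")
    return ok
```

Calling `check_transformers()` at the top of a script gives a clearer message than the model_type error raised during loading.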

Method

  • Base Model: Darwin-27B-Opus (Qwen3.5-27B family)
  • Korean SFT: Supervised Fine-Tuning on high-quality instruction data focused on Korea-specific knowledge, including Korean culture, history, law, economics, society, and geography
  • Thinking Mode: supports step-by-step Chain-of-Thought reasoning via <think> tags
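Because the reasoning is wrapped in <think> tags, downstream code usually wants to separate it from the final answer. The sketch below shows one way to do that on the decoded text. The function name `split_thinking` is hypothetical, and it assumes the tags appear literally in the decoded output. With some chat templates the opening tag is part of the prompt, so only `</think>` shows up in the generation, and the helper tolerates that case.

```python
def split_thinking(text: str, open_tag: str = "<think>", close_tag: str = "</think>"):
    """Split decoded model output into (reasoning, answer).

    Assumption: the reasoning ends at the first closing tag. The opening
    tag is optional because it may have been emitted as part of the prompt.
    """
    head, sep, tail = text.partition(close_tag)
    if not sep:
        if open_tag in text:
            # Generation stopped mid-reasoning: no answer was produced yet.
            return text.split(open_tag, 1)[1].strip(), ""
        # No tags at all: treat the whole text as the answer.
        return "", text.strip()
    reasoning = head.replace(open_tag, "", 1).strip()
    return reasoning, tail.strip()


reasoning, answer = split_thinking("<think>2 + 2 = 4</think>\n\nThe answer is 4.")
print(answer)  # prints: The answer is 4.
```

If an empty answer comes back, the generation probably hit `max_new_tokens` before the reasoning finished, and a larger budget may be needed.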

Benchmark

| Benchmark | Score |
|---|---|
| CLIcK (Korean Cultural & Linguistic Knowledge) | 81.0% |
| KMMLU-Pro (Korean MMLU Professional) | 74.0% |

Model Specifications

| Property | Value |
|---|---|
| Architecture | Qwen3.5 Hybrid (GatedDeltaNet + Attention, 64 layers) |
| Parameters | ~27B |
| Hidden Size | 5120 |
| Intermediate Size | 16384 |
| Context Length | 262,144 tokens |
| Precision | BF16 |
| Vocab Size | 248,320 |
| Thinking | Supported (<think> tags) |
| License | Apache 2.0 |
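The figures above allow a rough estimate of the memory needed just to hold the weights. The sketch below works through that arithmetic. It assumes exactly 27e9 parameters (the table only says ~27B) and 2 bytes per parameter for BF16, and it excludes the KV cache, activations, and framework overhead, so real usage will be higher. The function names are hypothetical.

```python
GIB = 2**30  # bytes per gibibyte


def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory for the weights alone; BF16 stores each parameter in 2 bytes."""
    return num_params * bytes_per_param / GIB


def embedding_params(vocab_size: int, hidden_size: int) -> int:
    """Parameter count of one vocab_size x hidden_size embedding matrix."""
    return vocab_size * hidden_size


# ~27B parameters in BF16 (approximation, see the assumptions above).
print(f"{weight_memory_gib(27e9):.1f} GiB")        # prints: 50.3 GiB
# Vocab Size x Hidden Size from the table.
print(f"{embedding_params(248_320, 5_120):,}")     # prints: 1,271,398,400
```

So the weights alone take roughly 50 GiB, which is why `device_map="auto"` or multiple GPUs may be needed on common hardware.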

Usage

Requirements: transformers >= 5.5.4

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "Anserwise/AWAXIS-Think-27b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Anserwise/AWAXIS-Think-27b")

# Prompt (Korean): "Please explain the civil service examination system of the Joseon Dynasty."
messages = [{"role": "user", "content": "조선시대의 과거제도에 대해 설명해주세요."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))

vLLM

vllm serve Anserwise/AWAXIS-Think-27b \
    --enforce-eager \
    --max-model-len 32768 \
    --dtype bfloat16
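Once the server is running, `vllm serve` exposes an OpenAI-compatible HTTP API. The following is a minimal client sketch using only the standard library. It assumes the server is reachable at vLLM's default address `http://localhost:8000` and that the served model name equals the repository id. Both may differ in a given deployment, and the helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumption: vLLM default host and port
MODEL = "Anserwise/AWAXIS-Think-27b"   # assumption: served under the repo id


def build_chat_request(prompt: str, max_tokens: int = 1024) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.0,  # greedy decoding, mirroring do_sample=False above
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str) -> str:
    """Send the request and return the assistant message text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Usage (requires the vLLM server to be running):
#   print(chat("Summarize the history of Hangul in three sentences."))
```

Keep `max_tokens` plus the prompt length within the `--max-model-len 32768` limit set when starting the server.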

Features

  • Inherits the strong reasoning ability of Darwin-27B-Opus
  • Strengthened Korea-specific knowledge, including Korean culture, history, law, economics, and society
  • Step-by-step reasoning via thinking mode
  • Multilingual support (Korean, English, Japanese, Chinese)
  • Supports a 262K-token context length

Training

| Item | Details |
|---|---|
| Base Model | FINAL-Bench/Darwin-27B-Opus |
| Method | Korean-specialized Supervised Fine-Tuning |
| Data | High-quality instruction data focused on Korean culture and knowledge |
| Developer | Anserwise |

Acknowledgements