---
license: apache-2.0
language:
- ko
- en
library_name: transformers
tags:
- korean
- reasoning
- darwin
- evolutionary-merge
base_model:
- FINAL-Bench/Darwin-27B-Opus
- NewenAI/QuettaLLMs-27B-Koreasoner-V3
---
# Warecube-KO-27B
A Korean reasoning model, built on Darwin evolutionary merging.
---
## 🧬 Darwin Evolution Concept
This model was built with the **Darwin V7 Evolutionary Model Merge** paradigm.
```
Natural evolution          Darwin merge
─────────────────          ────────────
Gene crossover           → Per-module ratio blending of weights
Natural selection        → Fitness evaluation, then best-offspring selection
Generational evolution   → Repeated multi-generation merge and refinement
Survival of the fittest  → Only offspring strong on the K-AI domain are kept
```
The parents' capabilities are **genetically inherited** by the child model,
and its Korean, reasoning, and cultural intelligence evolve across generations.
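As a rough illustration of the crossover row in the table above, the sketch below blends two parents' weights with a separate ratio for each module. The module names, the ratios, and the `crossover` helper are hypothetical, and plain lists stand in for tensors; this card does not document the actual Darwin V7 merge procedure.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]  # module name -> flattened weights (toy stand-in for tensors)


def crossover(father: Weights, mother: Weights, ratios: Dict[str, float]) -> Weights:
    """Blend two parents module by module.

    ratios[name] is the share taken from the father; the rest comes from the mother.
    """
    child: Weights = {}
    for name, f_w in father.items():
        r = ratios[name]
        m_w = mother[name]
        child[name] = [r * f + (1.0 - r) * m for f, m in zip(f_w, m_w)]
    return child


# Hypothetical two-module parents and per-module ratios.
father = {"attn": [1.0, 1.0], "mlp": [0.0, 2.0]}
mother = {"attn": [0.0, 0.0], "mlp": [4.0, 2.0]}
ratios = {"attn": 0.75, "mlp": 0.25}

child = crossover(father, mother, ratios)
print(child)
```

Here the attention module leans toward the father and the MLP module toward the mother, which is the kind of per-module trade-off that a later fitness evaluation would score.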
---
## 🏛️ Family Lineage
```
┌──────────────────────────────────────────────────────┐
│ Great-Grandfather                                    │
│ Qwen-3.5-27B                                         │
│ - multimodal 28B base                                │
└──────────────────────────────────────────────────────┘
                           │
                           ▼ Darwin V7 evolutionary merge
┌──────────────────────────────────────────────────────┐
│ Grandfather                                          │
│ FINAL-Bench/Darwin-27B-Opus                          │
│ - Darwin V7 evolutionary refinement                  │
│ - GPQA 88.4% reasoning                               │
│ - <think> trace pattern                              │
└──────────────────────────────────────────────────────┘
                           │
                           ▼ Korean-specialized evolution
┌──────────────────────────────────────────────────────┐
│ Father                                               │
│ Darwin family, Korean direct line                    │
│                                                      │
│ - Korean-specialized descendant of Darwin-27B-Opus   │
│ - reasoning DNA preserved                            │
│ - <think> pattern retained                           │
└──────────────────────────────────────────────────────┘
                           │
                  ×× crossed again ××
                           │
┌──────────────────────────────────────────────────────┐
│ Mother                                               │
│ NewenAI/QuettaLLMs-27B-Koreasoner-V3                 │
│                                                      │
│ - Korean SOTA model                                  │
│ - K-AI Leaderboard #1 (avg 0.560)                    │
│ - Korean domain SFT refinement                       │
│ - Apache 2.0                                         │
└──────────────────────────────────────────────────────┘
                           │
                           ▼ Darwin evolutionary merge + Korean refinement
┌──────────────────────────────────────────────────────┐
│ Child: this model                                    │
│ Warecube/Warecube-KO-27B                             │
│                                                      │
│ ✦ Inherits father's reasoning DNA                    │
│ ✦ Inherits mother's Korean expression and knowledge  │
│ ✦ Preserves the <think> reasoning trace              │
│ ✦ Evolved for K-AI domain fitness                    │
└──────────────────────────────────────────────────────┘
```
---
## 🔄 Evolution Stages
| Stage | Summary |
|:---|:---|
| **1. Crossover** | Paternal and maternal weights are merged at per-module ratios |
| **2. Selection** | Top offspring are selected by Korean-domain fitness evaluation |
| **3. Refinement** | Further evolution on Korean instruction data |
| **4. Adaptation** | Packaged in a K-AI Leaderboard Docker-compatible format |
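The selection stage can be pictured as a small evolutionary search over per-module merge ratios. The sketch below is a toy under stated assumptions: the population size, mutation scale, module names, and `toy_fitness` are all hypothetical. In a real run the fitness would be a benchmark score of the merged model, and this card does not describe the search actually used.

```python
import random
from typing import Callable, Dict, List

Ratios = Dict[str, float]  # module name -> share of weights taken from the father


def evolve(
    modules: List[str],
    fitness: Callable[[Ratios], float],
    generations: int = 5,
    population: int = 8,
    keep: int = 2,
    seed: int = 0,
) -> Ratios:
    """Toy evolutionary search: keep the fittest candidates, mutate them, repeat."""
    rng = random.Random(seed)
    pop = [{m: rng.random() for m in modules} for _ in range(population)]
    for _ in range(generations):
        # Selection: rank candidates by fitness and keep the top `keep`.
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:keep]
        # Next generation: survivors plus mutated copies of survivors.
        children = []
        while len(survivors) + len(children) < population:
            parent = rng.choice(survivors)
            children.append(
                {m: min(1.0, max(0.0, parent[m] + rng.gauss(0.0, 0.1))) for m in modules}
            )
        pop = survivors + children
    return max(pop, key=fitness)


def toy_fitness(ratios: Ratios) -> float:
    # Hypothetical stand-in for a benchmark score: closer to the target is better.
    target = {"attn": 0.7, "mlp": 0.3}
    return -sum((ratios[m] - target[m]) ** 2 for m in target)


best = evolve(["attn", "mlp"], toy_fitness)
print(best, toy_fitness(best))
```

Because the survivors are carried over unchanged, the best fitness never drops from one generation to the next.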
---
## 🎯 Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "Warecube/Warecube-KO-27B"

# Load the tokenizer and the bf16 weights, placed automatically on available devices.
tokenizer = AutoTokenizer.from_pretrained(
    model_id, trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Prompt in Korean: "Please explain Korea's Chuseok holiday."
prompt = "한국의 추석에 대해 설명해주세요."
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
)

# Greedy decoding; special tokens are kept in the decoded text.
out = model.generate(
    inputs.to(model.device),
    max_new_tokens=512,
    do_sample=False,
)
print(tokenizer.decode(out[0], skip_special_tokens=False))
```
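Since the model emits a `<think>` reasoning trace, it can be handy to separate the trace from the final answer. The helper below is a minimal sketch that assumes the trace is closed by a matching `</think>` tag, which this card does not state explicitly. `split_think` and the sample string are hypothetical and run without loading the model.

```python
import re
from typing import Tuple

# Assumption: the reasoning trace is wrapped as <think>...</think>.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_think(text: str) -> Tuple[str, str]:
    """Return (reasoning, answer) from text that may contain one <think> block."""
    match = THINK_RE.search(text)
    if match is None:
        # No trace found: treat the whole text as the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer


sample = "<think>Chuseok is a harvest festival.</think>Chuseok is Korea's autumn harvest holiday."
reasoning, answer = split_think(sample)
print("reasoning:", reasoning)
print("answer:", answer)
```

With real output, decode only the newly generated tokens first (for example `out[0][inputs.shape[-1]:]`), so the prompt and chat-template markers are not mistaken for part of the answer.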
---
## 🛠️ Specs
- Parameters: 27B (text)
- Precision: bf16
- Context: 8K (extendable)
- Languages: Korean + English
- Reasoning: `<think>` reasoning trace
- License: Apache 2.0
---
## 📊 Evaluation
Evaluated on 10 public Korean datasets, 100 questions each, with 1 seed.
| Dataset | Score |
|:---|---:|
| CLIcK | **87%** |
| KMMLU History | **50%** |
| KMMLU Law | **29%** |
| KMMLU Health | 78% |
| HAERAE General | 58% |
| HAERAE History | 86% |
| HAERAE Linguistics | 89% |
| KoBEST Hellaswag | 89% |
| KoBEST COPA | **100%** |
| KoBEST BoolQ | 97% |
| **Macro Avg** | **76.3%** |
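The macro average is the unweighted mean of the ten per-dataset scores, which the short check below reproduces from the table. The `binomial_se` helper is an addition for illustration and not part of the original evaluation. It estimates the sampling error of each score under the assumption that the 100 questions are independent.

```python
import math

# Per-dataset accuracy in percent, copied from the table above.
scores = {
    "CLIcK": 87,
    "KMMLU History": 50,
    "KMMLU Law": 29,
    "KMMLU Health": 78,
    "HAERAE General": 58,
    "HAERAE History": 86,
    "HAERAE Linguistics": 89,
    "KoBEST Hellaswag": 89,
    "KoBEST COPA": 100,
    "KoBEST BoolQ": 97,
}

macro_avg = sum(scores.values()) / len(scores)
print(f"Macro avg: {macro_avg:.1f}%")  # Macro avg: 76.3%


def binomial_se(p: float, n: int) -> float:
    """Standard error of an accuracy p measured on n independent questions."""
    return math.sqrt(p * (1.0 - p) / n)


# With only 100 questions per dataset, mid-range scores carry several points of noise.
for name, pct in scores.items():
    se = 100 * binomial_se(pct / 100, 100)
    print(f"{name}: {pct}% +/- {se:.1f}")
```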
---
## 🤝 Sources
- Grandfather: [FINAL-Bench/Darwin-27B-Opus](https://huggingface.co/FINAL-Bench/Darwin-27B-Opus)
- Mother: [NewenAI/QuettaLLMs-27B-Koreasoner-V3](https://huggingface.co/NewenAI/QuettaLLMs-27B-Koreasoner-V3)
- Family: Darwin family (Darwin V7 evolutionary merge series)