---
license: apache-2.0
language:
- ko
- en
tags:
- korean
- reasoning
- instruction-tuning
- fine-tuning
- qwq
- sft
---

# 🧠 QwQ-32B-Ko-Reasoning

> A large-scale Korean reasoning model fine-tuned from **Qwen/QwQ-32B**, designed to excel at logical and multi-hop reasoning tasks in Korean.

---

## 📌 Overview

**QwQ-32B-Ko-Reasoning** is a fine-tuned version of [Qwen/QwQ-32B](https://huggingface.co/Qwen/QwQ-32B), specifically optimized for **logical reasoning in Korean**. This model is part of a broader research initiative exploring:

- The **transition from multilingual reasoning LLMs** to **Korean-specialized reasoning models**
- The enhancement of **non-reasoning Korean language models** into **reasoning-capable variants**
- The development of open-access models that rival proprietary alternatives on complex reasoning tasks

This model was fine-tuned on a large-scale Korean-English instruction dataset containing diverse multi-hop questions, symbolic logic tasks, and human-crafted reasoning steps.

---

## 🧪 Benchmark Results

> - 📊 All benchmarks were measured using the **0-shot CoT (Chain-of-Thought)** method.
> - 📊 The **Score** is either the **accuracy (%)** of correct answers or a rating on a **1-10 scale** from a judge model.
> - 📊 **LLM-as-a-judge** benchmarks were evaluated with **GPT-4o (2024-08-01-preview)**.

| **Benchmark** | **Score** |
|------------------|---------------|
| GPQA diamond | 71.7 |
| GSM8K | 74.6 |
| HAERAE | 83.0 |
| KSM | 83.1 |
| LogicKor | 8.93 |
| Math500 | 85.3 |
| MT-Bench | 8.36 |
| MT-Bench (Ko) | 8.02 |

---

## 🧑‍💻 Usage

Install Transformers >= 4.50:

```bash
pip install -U transformers
```
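If you want to verify the requirement programmatically before loading the model, a minimal sketch follows; the `meets_minimum` helper is illustrative, not part of the Transformers API:

```python
def meets_minimum(installed: str, required: str = "4.50.0") -> bool:
    """Compare dotted release versions numerically, ignoring non-numeric suffixes."""
    def parse(version: str) -> tuple:
        parts = []
        for piece in version.split("."):
            digits = "".join(ch for ch in piece if ch.isdigit())
            if not digits:
                break
            parts.append(int(digits))
        return tuple(parts)
    return parse(installed) >= parse(required)

# In practice you would pass transformers.__version__ here.
print(meets_minimum("4.50.0"))  # True
print(meets_minimum("4.49.2"))  # False
```

For production code, `packaging.version.parse` is the more robust choice, since it handles pre-release and dev version tags correctly.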
Basic example:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "DimensionSTP/QwQ-32B-Ko-Reasoning"

# Load in the checkpoint's native precision and spread across available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "μ„œμšΈκ³Ό λΆ€μ‚° 쀑 μ–΄λ””κ°€ 더 컀?"  # "Which is bigger, Seoul or Busan?"
messages = [
    {"role": "user", "content": prompt},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768,
)
# Drop the prompt tokens so only the newly generated tokens are decoded.
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
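QwQ-style models emit their chain-of-thought before the final answer, typically terminated by a `</think>` marker (as described on the Qwen/QwQ-32B base model card). A small post-processing sketch to separate the two; the helper name and fallback behavior are illustrative:

```python
def split_reasoning(response: str, marker: str = "</think>") -> tuple:
    """Split a decoded QwQ-style response into (reasoning, final_answer)."""
    if marker in response:
        reasoning, _, answer = response.partition(marker)
        # The opening <think> tag, if present, belongs to the reasoning span.
        return reasoning.replace("<think>", "").strip(), answer.strip()
    # No marker found: treat the whole response as the final answer.
    return "", response.strip()

reasoning, answer = split_reasoning(
    "<think>\nCompare the two cities by area.\n</think>\nBusan is larger by area."
)
print(answer)  # Busan is larger by area.
```

This is useful when you want to log the reasoning trace but show users only the final answer.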
---

## 🧠 Base Model: Qwen/QwQ-32B

The base model, [Qwen/QwQ-32B](https://huggingface.co/Qwen/QwQ-32B), is a chain-of-thought reasoning LLM developed by the Alibaba Qwen team.
For more technical details, refer to the [Qwen2.5 Technical Report](https://arxiv.org/pdf/2412.15115).

---

## 🧱 Model Architecture

| Property | Value |
|------------------|------------------------|
| Architecture | Qwen2ForCausalLM |
| Parameters | 32B |
| Context Length | 131,072 tokens |
| Tokenizer | QwenTokenizer (BPE) |

---

## 📅 Release Date

**March 2025**
This model was released in March 2025 as part of the **Ko-Reasoning Series**, which focuses on pushing the boundaries of open-source reasoning in Korean with modern LLMs.

---

## 📬 Contact

For questions, collaborations, or deployment inquiries, please contact:

- 🤖 Hugging Face: [https://huggingface.co/DimensionSTP](https://huggingface.co/DimensionSTP)
- ✉️ Email: ddang8jh@gmail.com

---

## 📦 Available Checkpoints

- ✅ `main`: final stable version, from the `last` branch
- ✅ All training artifacts available (tokenizer, config, model weights)