---
license: other
license_name: hyperclovax
license_link: https://huggingface.co/naver-hyperclovax/HyperCLOVAX-SEED-Think-32B/blob/main/LICENSE
library_name: transformers
base_model: naver-hyperclovax/HyperCLOVAX-SEED-Think-32B
tags:
- llama
- text-generation
- korean
- reasoning
language:
- ko
- en
pipeline_tag: text-generation
---
# HyperCLOVAX-SEED-Text-Think-32B
**Extracted text-only LLM from [naver-hyperclovax/HyperCLOVAX-SEED-Think-32B](https://huggingface.co/naver-hyperclovax/HyperCLOVAX-SEED-Think-32B)**
This model contains only the language model component extracted from the original Vision-Language Model (VLM). The vision encoder and multimodal projector have been removed, making it a pure text-to-text model compatible with standard LLaMA inference pipelines.
## Model Details
| Property | Value |
|----------|-------|
| Architecture | LlamaForCausalLM |
| Parameters | ~33B |
| Hidden Size | 5120 |
| Layers | 72 |
| Attention Heads | 40 |
| KV Heads | 8 (GQA) |
| Intermediate Size | 24192 |
| Context Length | 128K |
| Vocab Size | 128,256 |
| Precision | bfloat16 |
| RoPE Theta | 50,000,000 |
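As a sanity check, the ~33B figure can be roughly reproduced from the other table values. The sketch below uses only the dimensions listed above, assuming an untied `lm_head` and ignoring norms and biases as negligible:

```python
# Rough parameter count from the table above (untied lm_head assumed,
# norms/biases ignored as negligible).
hidden, layers, heads, kv_heads = 5120, 72, 40, 8
intermediate, vocab = 24192, 128256

head_dim = hidden // heads                          # 128
kv_dim = kv_heads * head_dim                        # 1024 (GQA: 8 KV heads)

attn = hidden * hidden * 2 + hidden * kv_dim * 2    # q,o + k,v projections
mlp = hidden * intermediate * 3                     # gate, up, down
embed = vocab * hidden * 2                          # embed_tokens + lm_head

total = layers * (attn + mlp) + embed
print(f"~{total / 1e9:.1f}B parameters")            # ~32.6B
```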
## What Was Extracted
The original VLM consists of:
- **Vision Encoder**: Qwen2.5-VL based (~600M params) - **removed**
- **MM Projector**: Multimodal projection layers - **removed**
- **Language Model**: HyperCLOVAX LLM (~33B params) - **extracted** ✓
Only the `model.language_model.*` weights were extracted and remapped to standard LLaMA format.
## Usage
### With Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "minpeter/HyperCLOVAX-SEED-Text-Think-32B-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
model_id,
torch_dtype="bfloat16",
device_map="auto"
)
messages = [{"role": "user", "content": "What is the capital of South Korea?"}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True)
outputs = model.generate(inputs.to(model.device), max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### With vLLM
```bash
vllm serve minpeter/HyperCLOVAX-SEED-Text-Think-32B-hf \
--dtype bfloat16 \
--tensor-parallel-size 2
```
```python
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")
response = client.chat.completions.create(
model="minpeter/HyperCLOVAX-SEED-Text-Think-32B-hf",
messages=[{"role": "user", "content": "안녕하세요! 한국어로 대화할 수 있나요?"}]
)
print(response.choices[0].message.content)
```
## Thinking Mode
The model supports a "thinking mode" for complex reasoning tasks: it may emit its intermediate reasoning inside `<|thinking|>` delimited blocks before the final answer:
```python
messages = [
{"role": "user", "content": "Solve this step by step: If x + 2y = 10 and 3x - y = 5, find x and y."}
]
# The model may produce <|thinking|>...</|thinking|> blocks with its reasoning process
```
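If you only want the final answer, the reasoning block can be stripped from the decoded output. A minimal sketch, assuming the delimiters are spelled exactly as in the comment above (check the model's `chat_template.jinja` and `special_tokens_map.json` for the authoritative token spellings):

```python
import re

# Remove <|thinking|>...</|thinking|> blocks, keeping only the final answer.
# The exact delimiter spelling is an assumption; verify it against the
# model's chat template before relying on this.
THINK_RE = re.compile(r"<\|thinking\|>.*?</\|thinking\|>", re.DOTALL)

def strip_thinking(text: str) -> str:
    return THINK_RE.sub("", text).strip()

raw = "<|thinking|>from 3x - y = 5, y = 3x - 5; substitute...</|thinking|>x = 20/7, y = 25/7"
print(strip_thinking(raw))  # x = 20/7, y = 25/7
```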
## Hardware Requirements
- **Minimum**: 2x NVIDIA A100 40GB (with tensor parallelism)
- **Recommended**: 2x NVIDIA A100 80GB or 4x NVIDIA A6000
## Limitations
- This is a **text-only** model. It cannot process images or videos.
- The model inherits any limitations from the original HyperCLOVAX-SEED-Think-32B.
- Optimized primarily for Korean and English.
## License
This model inherits the [HyperCLOVAX license](https://huggingface.co/naver-hyperclovax/HyperCLOVAX-SEED-Think-32B/blob/main/LICENSE) from the original model.
## Citation
If you use this model, please cite the original:
```bibtex
@misc{hyperclovax-seed-think-32b,
title={HyperCLOVA X SEED Think 32B},
author={NAVER Cloud},
year={2025},
url={https://huggingface.co/naver-hyperclovax/HyperCLOVAX-SEED-Think-32B}
}
```
## Reproduce This Extraction
Want to extract the LLM yourself? Use the included [`extract_llm.py`](extract_llm.py) script.
### Prerequisites
```bash
pip install safetensors torch tqdm huggingface_hub
```
### Step 1: Download Original VLM (~66GB)
```bash
huggingface-cli download naver-hyperclovax/HyperCLOVAX-SEED-Think-32B \
--local-dir ./HyperCLOVAX-SEED-Think-32B
```
### Step 2: Run Extraction Script
```bash
# Download the extraction script
wget https://huggingface.co/minpeter/HyperCLOVAX-SEED-Text-Think-32B-hf/resolve/main/extract_llm.py
# Run extraction
python extract_llm.py \
--input ./HyperCLOVAX-SEED-Think-32B \
--output ./HyperCLOVAX-SEED-Text-Think-32B
```
### What the Script Does
1. **Extracts LLM weights**: Filters `model.language_model.*` tensors from the VLM
2. **Remaps keys**: Converts to standard LLaMA format
- `model.language_model.model.*` → `model.*`
- `model.language_model.lm_head.*` → `lm_head.*`
3. **Creates config**: Generates LLaMA-compatible `config.json` from VLM's `text_config`
4. **Copies tokenizer**: Preserves all tokenizer files unchanged
### Output Structure
```
HyperCLOVAX-SEED-Text-Think-32B/
├── config.json # LLaMA config
├── generation_config.json
├── model-00001-of-00013.safetensors # ~5GB shards
├── ...
├── model-00013-of-00013.safetensors
├── model.safetensors.index.json
├── tokenizer.json
├── tokenizer_config.json
├── special_tokens_map.json
├── added_tokens.json
├── vocab.json
├── merges.txt
└── chat_template.jinja
```
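Before serving, it is worth checking that every tensor in `model.safetensors.index.json` points at a shard file that actually exists on disk. A minimal sketch (the directory argument is whatever output path you passed to the extraction script):

```python
import json
import os

def missing_shards(model_dir: str) -> set[str]:
    """Return shard filenames referenced by the index but absent on disk."""
    with open(os.path.join(model_dir, "model.safetensors.index.json")) as f:
        index = json.load(f)
    shards = set(index["weight_map"].values())
    return {s for s in shards
            if not os.path.exists(os.path.join(model_dir, s))}

# Example: missing_shards("./HyperCLOVAX-SEED-Text-Think-32B")
# An empty set means the checkpoint is complete.
```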
### Verify Extraction
```bash
# Quick test with vLLM
vllm serve ./HyperCLOVAX-SEED-Text-Think-32B \
--dtype bfloat16 \
--tensor-parallel-size 2
# In another terminal
curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model": "./HyperCLOVAX-SEED-Text-Think-32B", "messages": [{"role": "user", "content": "Hello!"}]}'
```
## Acknowledgments
- Original model by [NAVER Cloud HyperCLOVA X](https://huggingface.co/naver-hyperclovax)
- Extraction performed to enable text-only inference without vision dependencies