---
license: other
license_name: hyperclovax
license_link: https://huggingface.co/naver-hyperclovax/HyperCLOVAX-SEED-Think-32B/blob/main/LICENSE
library_name: transformers
base_model: naver-hyperclovax/HyperCLOVAX-SEED-Think-32B
tags:
- llama
- text-generation
- korean
- reasoning
language:
- ko
- en
pipeline_tag: text-generation
---

# HyperCLOVAX-SEED-Text-Think-32B

**Extracted text-only LLM from [naver-hyperclovax/HyperCLOVAX-SEED-Think-32B](https://huggingface.co/naver-hyperclovax/HyperCLOVAX-SEED-Think-32B)**

This model contains only the language model component extracted from the original Vision-Language Model (VLM). The vision encoder and multimodal projector have been removed, making it a pure text-to-text model compatible with standard LLaMA inference pipelines.

## Model Details

| Property | Value |
|----------|-------|
| Architecture | LlamaForCausalLM |
| Parameters | ~33B |
| Hidden Size | 5120 |
| Layers | 72 |
| Attention Heads | 40 |
| KV Heads | 8 (GQA) |
| Intermediate Size | 24192 |
| Context Length | 128K |
| Vocab Size | 128,256 |
| Precision | bfloat16 |
| RoPE Theta | 50,000,000 |
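
As a quick sanity check, the sketch below reads the published `config.json` via `AutoConfig` and prints the fields listed in the table. The field names follow the standard `LlamaConfig` schema; the `max_position_embeddings` value of 131072 is an assumption for the 128K context, so verify it against the actual config.

```python
from transformers import AutoConfig

# Minimal sketch: confirm the architecture fields from the repo's config.json.
config = AutoConfig.from_pretrained("minpeter/HyperCLOVAX-SEED-Text-Think-32B-hf")

print(config.model_type)               # "llama"
print(config.hidden_size)              # 5120
print(config.num_hidden_layers)        # 72
print(config.num_attention_heads)      # 40
print(config.num_key_value_heads)      # 8 (GQA)
print(config.intermediate_size)        # 24192
print(config.max_position_embeddings)  # assumed 131072 (128K context)
print(config.vocab_size)               # 128256
print(config.rope_theta)               # 50000000.0
```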

## What Was Extracted

The original VLM consists of:
- **Vision Encoder**: Qwen2.5-VL based (~600M params) - **removed**
- **MM Projector**: Multimodal projection layers - **removed**
- **Language Model**: HyperCLOVAX LLM (~33B params) - **extracted** ✓

Only the `model.language_model.*` weights were extracted and remapped to standard LLaMA format.
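
For reference, a minimal sketch of that kind of remapping, assuming safetensors weights and the `model.language_model.` prefix implied above. The file names and exact key layout are illustrative assumptions, not the actual extraction script:

```python
from safetensors.torch import load_file, save_file

# Illustrative sketch: drop vision/projector weights and strip the VLM wrapper
# prefix so the remaining keys match what LlamaForCausalLM expects.
# Source file name and key prefixes are assumptions based on the description above.
state_dict = load_file("hyperclovax-seed-think-32b.safetensors")

extracted = {}
for name, tensor in state_dict.items():
    if name.startswith("model.language_model."):
        # e.g. "model.language_model.layers.0.self_attn.q_proj.weight"
        #   -> "model.layers.0.self_attn.q_proj.weight"
        extracted["model." + name[len("model.language_model."):]] = tensor
    elif name.startswith("lm_head."):
        # Keep the output head if it is stored at the top level (assumption).
        extracted[name] = tensor
    # Vision encoder and MM projector weights are simply not copied over.

save_file(extracted, "model.safetensors")
```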

## Usage

### With Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "minpeter/HyperCLOVAX-SEED-Text-Think-32B-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="bfloat16",
    device_map="auto"
)

messages = [{"role": "user", "content": "What is the capital of South Korea?"}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True)
outputs = model.generate(inputs.to(model.device), max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### With vLLM

```bash
vllm serve minpeter/HyperCLOVAX-SEED-Text-Think-32B-hf \
  --dtype bfloat16 \
  --tensor-parallel-size 2
```

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")
response = client.chat.completions.create(
    model="minpeter/HyperCLOVAX-SEED-Text-Think-32B-hf",
    # Korean prompt: "Hello! Can you chat in Korean?"
    messages=[{"role": "user", "content": "안녕하세요! 한국어로 대화할 수 있나요?"}]
)
print(response.choices[0].message.content)
```

## Thinking Mode

The model supports a "thinking mode" for complex reasoning tasks. Use the `<|thinking|>` token to trigger extended reasoning:

```python
messages = [
    {"role": "user", "content": "Solve this step by step: If x + 2y = 10 and 3x - y = 5, find x and y."}
]
# The model may produce <|thinking|>...</|thinking|> blocks with its reasoning process
```
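
If you want to show only the final answer, a simple post-processing sketch is shown below. The `<|thinking|>`/`</|thinking|>` marker strings are taken from the comment above and should be treated as assumptions; check the tokenizer's chat template for the exact tokens.

```python
def split_thinking(text: str) -> tuple[str, str]:
    """Split an optional <|thinking|>...</|thinking|> block from the final answer.

    The marker strings are assumptions based on the example above.
    """
    start, end = "<|thinking|>", "</|thinking|>"
    if start in text and end in text:
        thinking = text.split(start, 1)[1].split(end, 1)[0].strip()
        answer = text.split(end, 1)[1].strip()
        return thinking, answer
    return "", text.strip()

# Hypothetical decoded output for the prompt above:
raw = "<|thinking|>Substitute y = 3x - 5 into x + 2y = 10 ...</|thinking|>x = 20/7 and y = 25/7."
thinking, answer = split_thinking(raw)
print(answer)  # -> "x = 20/7 and y = 25/7."
```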

## Hardware Requirements

- **Minimum**: 2x NVIDIA A100 40GB (with tensor parallelism)
- **Recommended**: 2x NVIDIA A100 80GB or 4x NVIDIA A6000
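
As a rough sanity check on these numbers: in bfloat16 the weights alone take about 33 billion parameters × 2 bytes ≈ 61 GiB, which is why a single 40 GB GPU cannot hold them and why the minimum configuration relies on tensor parallelism. A back-of-the-envelope sketch:

```python
# Weights-only estimate; KV cache and activations add more on top,
# so treat this as a lower bound on GPU memory.
params = 33e9        # ~33B parameters (see the table above)
bytes_per_param = 2  # bfloat16

weight_gib = params * bytes_per_param / 1024**3
print(f"~{weight_gib:.0f} GiB of weights in bf16")                    # ~61 GiB
print(f"~{weight_gib / 2:.0f} GiB per GPU with TP=2 (weights only)")  # ~31 GiB
```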

## Limitations

- This is a **text-only** model. It cannot process images or videos.
- The model inherits the limitations of the original HyperCLOVAX-SEED-Think-32B.
- Optimized primarily for Korean and English.

## License

This model inherits the [HyperCLOVAX license](https://huggingface.co/naver-hyperclovax/HyperCLOVAX-SEED-Think-32B/blob/main/LICENSE) from the original model.

## Citation

If you use this model, please cite the original:

```bibtex
@misc{hyperclovax-seed-think-32b,
  title={HyperCLOVA X SEED Think 32B},
  author={NAVER Cloud},
  year={2025},
  url={https://huggingface.co/naver-hyperclovax/HyperCLOVAX-SEED-Think-32B}
}
```

## Acknowledgments

- Original model by [NAVER Cloud HyperCLOVA X](https://huggingface.co/naver-hyperclovax)
- Extraction performed to enable text-only inference without vision dependencies