Akicou committed on
Commit 3c49fda · verified
1 Parent(s): 72a0772

Create README.md

Files changed (1)
  1. README.md +196 -0
README.md ADDED
@@ -0,0 +1,196 @@
---
language:
- en
- ko
license: other
license_name: solar-apache-2.0
tags:
- upstage
- solar
- moe
- 100b
- llm
---

<p align="center">
  <img src="./Solar-Open-69B-REAP.png" alt="Solar Open Model" width="100%">
</p>

# **Solar Open to 69B REAP**

**Solar Open** is Upstage's flagship **102B-parameter** model. This repository contains a version pruned to 69B parameters with REAP, using a modified version of Cerebras's REAP repository on GitHub.

## Links to Quants

- Coming soon

# **Solar Open**

**Solar Open** is Upstage's flagship **102B-parameter** large language model, trained **entirely from scratch** and released under the **Solar-Apache License 2.0** (see [LICENSE](#license) for details). As a **Mixture-of-Experts (MoE)** architecture, it delivers enterprise-grade performance in reasoning, instruction-following, and agentic capabilities, all while prioritizing transparency and customization for the open-source community.

## Highlights

* **MoE Architecture (102B / 12B):** Built on a Mixture-of-Experts architecture with **102B total / 12B active parameters**. This design delivers the knowledge depth of a massive model with the inference speed and cost-efficiency of a much smaller model.
* **Massive Training Scale:** Pre-trained on **19.7 trillion tokens**, ensuring broad knowledge coverage and robust reasoning capabilities across various domains.

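To illustrate why only a fraction of the parameters is active per token, here is a minimal sketch of generic top-k expert routing. It uses the expert counts listed in the model overview (128 routed experts, top 8 selected). The softmax scoring, the renormalisation step, and the name `route_token` are assumptions for illustration; this is not Solar Open's actual router, which also adds a shared expert that every token passes through.

```python
import math
import random

NUM_ROUTED_EXPERTS = 128  # routed experts, as listed in the model overview
TOP_K = 8                 # routed experts selected per token


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k experts of one token.

    Hypothetical helper: a generic softmax router, not the model's real one.
    """
    # Numerically stable softmax over all routed experts.
    max_logit = max(router_logits)
    exps = [math.exp(x - max_logit) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep only the top_k experts with the highest routing probability.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    # Renormalise so the weights of the selected experts sum to 1.
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_ROUTED_EXPERTS)]
selection = route_token(logits)
for expert, weight in selection:
    print(f"expert {expert:3d} weight {weight:.3f}")
```

Only the 8 selected experts (plus the shared one) run for a given token, so the compute per token scales with the active parameters rather than the total.
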
## Model Overview

* **Model Name:** Solar Open 100B
* **Hugging Face ID:** Upstage/Solar-Open-100B
* **Architecture:** Mixture-of-Experts (MoE)
* **Total Parameters:** 102.6B
* **Active Parameters:** 12B (per token)
* **Experts:** 129 Experts (top 8 among 128 Routed + 1 Shared)
* **Pre-training Tokens:** 19.7 Trillion
* **Context Length:** 128k
* **Training Hardware:** NVIDIA B200 GPUs
* **License:** **Solar-Apache License 2.0** (See [LICENSE](./LICENSE))
* **Hardware Requirements:**
  * **Minimum:** 4x NVIDIA A100 (80GB)

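As a rough sanity check on the hardware requirement, the weight memory can be estimated from the parameter count. The sketch below assumes bf16 weights (2 bytes per parameter, matching the `torch.bfloat16` setting in the quickstart) and decimal gigabytes. It ignores the KV cache, activations, and framework overhead, so real usage is higher. The helper name `weight_memory_gb` is hypothetical, and the 69B figure is the nominal size from this repository's title.

```python
BYTES_PER_PARAM_BF16 = 2  # assumption: weights stored in bfloat16


def weight_memory_gb(num_params, bytes_per_param=BYTES_PER_PARAM_BF16):
    """Approximate weight-only memory in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


gpu_pool_gb = 4 * 80  # 4x NVIDIA A100 (80GB), the stated minimum

for name, params in [("Solar Open 102.6B", 102.6e9), ("69B REAP (nominal)", 69e9)]:
    needed = weight_memory_gb(params)
    print(f"{name}: ~{needed:.1f} GB of weights, "
          f"~{gpu_pool_gb - needed:.1f} GB left on 4x80GB")
```
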
## License
This repository contains both model weights and code,
which are licensed under different terms:

1. MODEL WEIGHTS (`*.safetensors`)
   Licensed under **Solar-Apache License 2.0**
   See: https://huggingface.co/upstage/Solar-Open-100B/blob/main/LICENSE

2. CODE (`*.py`, `*.json`, `*.jinja` files)
   Licensed under **Apache License 2.0**
   See: https://www.apache.org/licenses/LICENSE-2.0

## Performance

TBA

## Inference Quickstart

We recommend using the following generation parameters:

```
temperature=0.8
top_p=0.95
top_k=50
```

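To show what these three parameters do, here is a simplified sketch of how temperature, top-k, and top-p (nucleus) filtering shape the distribution a token is sampled from. The function name `sampling_distribution` is hypothetical. Library implementations such as the one in Transformers work on logits tensors and may differ in details like tie handling, so treat this as an illustration rather than the exact procedure.

```python
import math


def sampling_distribution(logits, temperature=0.8, top_k=50, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k, and top-p."""
    # Temperature < 1 sharpens the distribution, > 1 flattens it.
    scaled = [x / temperature for x in logits]
    # Top-k: keep only the k highest-scoring tokens (sorted, best first).
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    # Softmax over the surviving tokens (numerically stabilised).
    max_score = scaled[order[0]]
    exps = {i: math.exp(scaled[i] - max_score) for i in order}
    total = sum(exps.values())
    probs = {i: e / total for i, e in exps.items()}
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Renormalise the kept tokens so they form a valid distribution.
    norm = sum(probs[i] for i in kept)
    return {i: probs[i] / norm for i in kept}


example_logits = [2.0, 1.0, 0.5, -1.0, -3.0]
for token, prob in sampling_distribution(example_logits).items():
    print(f"token {token}: {prob:.3f}")
```
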
### Transformers

Install the required dependencies:

```bash
pip install -U transformers kernels torch accelerate
```

Run inference with the following code:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Upstream 102B checkpoint ID from the original model card;
# replace it with this repository's ID to load the pruned model.
MODEL_ID = "upstage/Solar-Open-100B"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

model = AutoModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path=MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Prepare input
messages = [{"role": "user", "content": "who are you?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
)
inputs = inputs.to(model.device)

# Generate response
generated_ids = model.generate(
    **inputs,
    max_new_tokens=4096,
    temperature=0.8,
    top_p=0.95,
    top_k=50,
    do_sample=True,
)
# Decode only the newly generated tokens, skipping the prompt
generated_text = tokenizer.decode(generated_ids[0][inputs.input_ids.shape[1] :])
print(generated_text)
```

### vLLM

#### Option 1: Using Docker (Highly Recommended)
Docker is the **recommended deployment method** for running `Solar-Open-100B`.

```bash
# For 8 GPUs
docker run --gpus all \
  --ipc=host \
  -p 8000:8000 \
  upstage/vllm-solar-open:latest \
  upstage/Solar-Open-100B \
  --trust-remote-code \
  --enable-auto-tool-choice \
  --tool-call-parser solar_open \
  --reasoning-parser solar_open \
  --logits-processors vllm.model_executor.models.parallel_tool_call_logits_processor:ParallelToolCallLogitsProcessor \
  --logits-processors vllm.model_executor.models.solar_open_logits_processor:SolarOpenTemplateLogitsProcessor \
  --tensor-parallel-size 8
```

#### Option 2: Installing from Source
For development, debugging, custom modifications, or offline inference, Solar Open can also be run
using a source installation of vLLM. We recommend using **[uv](https://docs.astral.sh/uv/)** for environment
management and dependency resolution.

Create and activate a Python virtual environment:
```bash
uv venv --python 3.12 --seed
source .venv/bin/activate
```

Install Solar Open's optimized vLLM:
```bash
VLLM_PRECOMPILED_WHEEL_LOCATION="https://github.com/vllm-project/vllm/releases/download/v0.12.0/vllm-0.12.0-cp38-abi3-manylinux_2_31_x86_64.whl" \
VLLM_USE_PRECOMPILED=1 \
uv pip install git+https://github.com/UpstageAI/vllm.git@v0.12.0-solar-open
```

Start the vLLM server (for 8 GPUs):
```bash
vllm serve upstage/Solar-Open-100B \
  --trust-remote-code \
  --enable-auto-tool-choice \
  --tool-call-parser solar_open \
  --reasoning-parser solar_open \
  --logits-processors vllm.model_executor.models.parallel_tool_call_logits_processor:ParallelToolCallLogitsProcessor \
  --logits-processors vllm.model_executor.models.solar_open_logits_processor:SolarOpenTemplateLogitsProcessor \
  --tensor-parallel-size 8
```
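
Once the server is running, it can be queried through vLLM's OpenAI-compatible HTTP API. The sketch below uses only the Python standard library. It assumes the server is reachable at `http://localhost:8000` (the port exposed in the Docker command and vLLM's default) and that the served model name matches the ID passed on the command line. The helper names `build_chat_payload` and `chat` and the `max_tokens` value are illustrative choices, not part of the original instructions.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumption: local server on port 8000
MODEL = "upstage/Solar-Open-100B"      # model name given to the server at startup


def build_chat_payload(prompt, temperature=0.8, top_p=0.95, max_tokens=512):
    """Build a chat-completions request body with the recommended sampling values."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }


def chat(prompt, base_url=BASE_URL):
    """Send one user message and return the assistant's reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server up, `print(chat("who are you?"))` should print the model's answer. Because the server is started with a reasoning parser, the response message may carry the reasoning trace in a separate field from `content`.
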

## Public API Access

The official API service for Solar Open is scheduled to launch publicly in **January**.

* **Access:** Upstage Console (TBA)
* **Documentation:** Upstage Console (TBA)

## Citation

If you use Solar Open in your research, please cite:

```bibtex
@misc{solar-open-2025,
  title={Solar Open: Scaling Upstage's LLM Capabilities with MoE},
  author={Upstage AI},
  year={2025},
  url={https://huggingface.co/Upstage/Solar-Open-100B}
}
```