---
license: apache-2.0
---

# Helion-2.5-Rnd

**DeepXR/Helion-2.5-Rnd** - Advanced Research & Development Language Model

## Overview

Helion-2.5-Rnd is a cutting-edge research language model designed for exceptional performance across multiple domains, including:

- **Advanced Reasoning**: Complex problem-solving and logical deduction
- **Code Generation**: Multi-language programming assistance
- **Mathematical Computation**: Proof generation and symbolic mathematics
- **Multilingual Understanding**: 50+ languages with cultural context
- **Creative Writing**: Story generation, poetry, and content creation
- **Scientific Analysis**: Research paper understanding and synthesis
- **Long Context**: A context window of up to 131K tokens

## Model Architecture

- **Type**: Transformer-based causal language model
- **Parameters**: 70B+
- **Architecture**: LLaMA-based with YaRN positional embeddings
- **Context Window**: 131,072 tokens (128K)
- **Precision**: BF16/FP16, with INT8/INT4 quantization support
- **Training Data**: 2.5 trillion tokens across diverse domains

## Quick Start

### Installation

```bash
# Clone the repository
git clone https://huggingface.co/DeepXR/Helion-2.5-Rnd
cd Helion-2.5-Rnd

# Install dependencies
pip install -r requirements.txt

# Or use Docker
docker build -t helion:2.5-rnd .
```

### Running the Server

#### Using Python

```bash
python -m inference.server \
  --model /path/to/model \
  --tensor-parallel-size 2 \
  --max-model-len 131072 \
  --gpu-memory-utilization 0.95
```

#### Using Docker

```bash
docker run -d \
  --gpus all \
  -p 8000:8000 \
  -v /path/to/model:/models/helion \
  -e MODEL_PATH=/models/helion \
  -e TENSOR_PARALLEL_SIZE=2 \
  helion:2.5-rnd
```

### Using the Client

```python
from inference.client import HelionClient, HelionAssistant

# Basic client
client = HelionClient(base_url="http://localhost:8000")

# Simple completion
response = client.complete(
    "Explain quantum entanglement:",
    temperature=0.7,
    max_tokens=500,
)

# Chat interface
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "What is machine learning?"},
]
response = client.chat(messages=messages)

# High-level assistant
assistant = HelionAssistant()
response = assistant.chat("Write a Python function for quicksort")
```

## API Endpoints

### Chat Completions

```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "DeepXR/Helion-2.5-Rnd",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ],
    "temperature": 0.7,
    "max_tokens": 1000
  }'
```

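The same request body can also be assembled in Python for scripting. This is a minimal stdlib-only sketch: `build_chat_request` is an illustrative helper, not part of this repository, and the endpoint shape is assumed to match the OpenAI-compatible route shown above.

```python
import json

def build_chat_request(messages, model="DeepXR/Helion-2.5-Rnd",
                       temperature=0.7, max_tokens=1000):
    """Compose the JSON body for POST /v1/chat/completions."""
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }

payload = build_chat_request([{"role": "user", "content": "Hello, how are you?"}])
body = json.dumps(payload)

# To actually send it (requires a running server):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8000/v1/chat/completions",
#     data=body.encode(), headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read())
```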
### Text Completions

```bash
curl -X POST http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "DeepXR/Helion-2.5-Rnd",
    "prompt": "Once upon a time",
    "temperature": 0.8,
    "max_tokens": 500
  }'
```

### Health Check

```bash
curl http://localhost:8000/health
```

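Since loading a 70B model can take a while, scripts may want to poll this endpoint until the server answers before sending requests. A stdlib-only sketch (the `/health` route is the one shown above; `wait_for_health` and its timeout values are illustrative):

```python
import time
import urllib.request
import urllib.error

def wait_for_health(url="http://localhost:8000/health",
                    timeout=60.0, interval=2.0):
    """Poll the health endpoint until it answers 200, or give up."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not up yet; retry after a short pause
        time.sleep(interval)
    return False
```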
## Configuration

### Model Parameters

See `model_config.yaml` for full configuration options:

- **Temperature**: 0.0-2.0 (default: 0.7)
- **Top-p**: 0.0-1.0 (default: 0.9)
- **Top-k**: Integer (default: 50)
- **Max Tokens**: 1-131072 (default: 4096)
- **Repetition Penalty**: 1.0-2.0 (default: 1.1)

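These ranges can be enforced client-side before a request is sent, so out-of-range values never reach the server. A hypothetical helper (the bounds and defaults are the ones listed above; `clamp_sampling` is illustrative, not part of the repository):

```python
# Documented ranges and defaults: (low, high, default); high=None means unbounded.
SAMPLING_BOUNDS = {
    "temperature":        (0.0, 2.0,    0.7),
    "top_p":              (0.0, 1.0,    0.9),
    "top_k":              (1,   None,   50),
    "max_tokens":         (1,   131072, 4096),
    "repetition_penalty": (1.0, 2.0,    1.1),
}

def clamp_sampling(**overrides):
    """Fill in defaults and clamp each parameter to its documented range."""
    out = {}
    for name, (lo, hi, default) in SAMPLING_BOUNDS.items():
        value = overrides.get(name, default)
        value = max(lo, value)
        if hi is not None:
            value = min(hi, value)
        out[name] = value
    return out
```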
### Hardware Requirements

**Minimum**:

- 2x NVIDIA A100 80GB GPUs
- 256GB RAM
- 500GB NVMe SSD

**Recommended**:

- 4x NVIDIA H100 80GB GPUs
- 512GB RAM
- 1TB NVMe SSD

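The two-GPU minimum follows from the weight footprint alone: 70B parameters at 2 bytes each (BF16) already exceed a single 80GB card before KV cache and activations are counted. A back-of-the-envelope sketch (70B is the stated lower bound; this ignores all runtime overhead):

```python
def weight_memory_gb(params_billion, bytes_per_param):
    """Raw weight storage in GiB, ignoring KV cache and activations."""
    return params_billion * 1e9 * bytes_per_param / 1024**3

bf16 = weight_memory_gb(70, 2)    # ~130 GiB: needs at least two 80GB GPUs
int4 = weight_memory_gb(70, 0.5)  # ~33 GiB: why INT4 quantization helps
```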
## Capabilities

### Code Generation

```python
messages = [
    {"role": "user", "content": "Write a binary search tree implementation in Rust"}
]
response = client.chat(messages=messages, temperature=0.3)
```

### Mathematical Reasoning

```python
response = client.complete(
    "Prove that the square root of 2 is irrational using contradiction:",
    temperature=0.5,
)
```

### Creative Writing

```python
response = client.complete(
    "Write a haiku about artificial intelligence:",
    temperature=0.9,
)
```

185
+ ### Multilingual Support
186
+
187
+ Helion supports 50+ languages including:
188
+ - English, Spanish, French, German, Italian
189
+ - Chinese (Simplified & Traditional), Japanese, Korean
190
+ - Arabic, Hebrew, Hindi, Russian
191
+ - And many more...
192
+
193
+ ## Benchmarks
194
+
195
+ | Benchmark | Score |
196
+ |-----------|-------|
197
+ | MMLU | 84.7% |
198
+ | GSM8K | 89.2% |
199
+ | HumanEval | 75.6% |
200
+ | MBPP | 72.3% |
201
+ | ARC Challenge | 83.4% |
202
+ | HellaSwag | 88.9% |
203
+ | TruthfulQA | 61.2% |
204
+
205
+ ## Safety and Limitations
206
+
207
+ ### Safety Features
208
+ - Content filtering for harmful outputs
209
+ - PII (Personally Identifiable Information) detection
210
+ - Prompt injection protection
211
+ - Toxicity thresholds
212
+
213
+ ### Known Limitations
214
+ - This is a **research model** - outputs should be verified
215
+ - May exhibit biases present in training data
216
+ - Performance on highly specialized domains may vary
217
+ - Long context (>64K tokens) performance degrades
218
+ - Not suitable for production without further fine-tuning
219
+
220
+ ## Research Use
221
+
222
+ This model is intended for **research and development purposes**. It represents an experimental version of the Helion architecture and is continuously being improved.
223
+
224
+ ### Citation
225
+
226
+ If you use this model in your research, please cite:
227
+
228
+ ```bibtex
229
+ @misc{helion-2.5-rnd,
230
+ title={Helion-2.5-Rnd: Advanced Research Language Model},
231
+ author={DeepXR Team},
232
+ year={2025},
233
+ publisher={DeepXR},
234
+ url={https://huggingface.co/DeepXR/Helion-2.5-Rnd}
235
+ }
236
+ ```
237
+
238
+ ## License
239
+
240
+ This model is released under the **Apache License 2.0**. See [LICENSE](LICENSE) for full details.
241
+
242
+ ## Support
243
+
244
+ - **Documentation**: See `docs/` directory
245
+ - **Issues**: Report on GitHub Issues
246
+ - **Community**: Join our Discord/Slack
247
+ - **Email**: support@deepxr.ai
248
+
249
+ ## Acknowledgments
250
+
251
+ Built upon the excellent work of:
252
+ - Meta AI (LLaMA architecture)
253
+ - Hugging Face (Transformers library)
254
+ - vLLM team (High-performance inference)
255
+ - The open-source AI community
256
+
257
+ ---
258
+
259
+ **DeepXR** - Advancing AI Research
260
+
261
+ Version: 2.5.0-rnd | Status: Research | Updated: 2025-01-30