Update README.md
# Helion-V2

<div align="center">

**A State-of-the-Art 7.2B Parameter Language Model for Daily Use**

[Apache 2.0 License](https://opensource.org/licenses/Apache-2.0)
[Python](https://www.python.org/downloads/)
[Transformers](https://github.com/huggingface/transformers)
[PyTorch](https://pytorch.org/)

[Model Card](#model-information) | [Usage](#usage) | [Benchmarks](#performance-benchmarks) | [Safety](#safety-and-moderation)

</div>

---

## Table of Contents

- [Model Overview](#model-overview)
- [Model Information](#model-information)
- [Performance Benchmarks](#performance-benchmarks)
- [Quick Start](#quick-start)
- [Usage](#usage)
- [Safety and Moderation](#safety-and-moderation)
- [Deployment Options](#deployment-options)
- [Training Details](#training-details)
- [Limitations](#limitations)
- [Citation](#citation)
- [License](#license)

---

## Model Overview

Helion-V2 is an advanced large language model engineered for practical, everyday applications. With 7.2 billion parameters and a focus on factual accuracy, conversational ability, and code generation, Helion-V2 delivers enterprise-grade performance on consumer hardware.

**Key Highlights:**

- **7.2B parameters** optimized for efficiency and quality
- **8,192-token context** for handling complex documents
- **Grouped Query Attention (GQA)** for 40% faster inference
- **Exceptional truthfulness** (52.1% on TruthfulQA, highest in its class)
- **Strong coding ability** (48.2% on HumanEval)
- **Multi-language support** with a primary focus on English
- **Apache 2.0 license** for commercial use
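
The GQA figure above refers to inference speed; much of the win comes from a KV cache a quarter the size of full multi-head attention. A rough sketch of that saving using this card's numbers (32 query heads, 8 KV heads, 32 layers); the head dimension of 4096 / 32 = 128 and bfloat16 cache precision are inferred, not stated:

```python
# KV-cache size at the full 8,192-token context, bfloat16 (2 bytes/value).
# head_dim = hidden_dim / query_heads = 4096 / 32 = 128 is an assumption.
n_layers, head_dim, seq_len, bytes_per = 32, 128, 8_192, 2

def kv_cache_bytes(n_kv_heads):
    # Factor of 2 covers the K and V tensors.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per

mha = kv_cache_bytes(32)  # standard multi-head attention
gqa = kv_cache_bytes(8)   # Helion-V2's grouped query attention
print(f"MHA: {mha / 2**30:.2f} GiB, GQA: {gqa / 2**30:.2f} GiB")
assert mha // gqa == 4  # 4x smaller cache with 8 KV heads
```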

---

## Model Information

### Architecture Details

| Specification | Value |
|---------------|-------|
| **Parameters** | 7.2 billion |
| **Architecture** | Decoder-only Transformer |
| **Layers** | 32 |
| **Hidden Dimension** | 4,096 |
| **Attention Heads** | 32 (query) / 8 (key-value) |
| **FFN Dimension** | 14,336 |
| **Context Length** | 8,192 tokens |
| **Vocabulary Size** | 32,768 tokens |
| **Position Encoding** | RoPE (Rotary Position Embedding) |
| **Normalization** | RMSNorm (eps: 1e-6) |
| **Activation** | SiLU (Swish) |
| **Attention Type** | Grouped Query Attention (GQA) |

### Model Card Metadata

| Property | Details |
|----------|---------|
| **Model Type** | Causal Language Model |
| **Languages** | English (primary), Spanish, French, German, Italian, Portuguese, Dutch, Russian, Chinese, Japanese, Korean, Arabic, Hindi |
| **License** | Apache 2.0 |
| **Training Data** | 2.5T tokens (web, code, books, papers) |
| **Knowledge Cutoff** | October 2024 |
| **Developed By** | DeepXR |
| **Model Family** | Helion |
| **Version** | 2.0 |
| **Release Date** | November 2024 |
| **Precision** | BFloat16 / Float16 |
| **Framework** | PyTorch 2.1+ |
| **Compute Type** | GPU (NVIDIA A100, H100, RTX 4090+) |
| **Finetuned From** | Trained from scratch |
| **Training Duration** | 21 days on 128x H100 GPUs |

### Supported Tasks

- **Text Generation**: Articles, stories, essays, reports
- **Conversational AI**: Multi-turn dialogue, chat applications
- **Code Generation**: Python, JavaScript, Java, C++, and 20+ languages
- **Question Answering**: Factual queries, reasoning tasks
- **Text Summarization**: Document condensation, key point extraction
- **Creative Writing**: Storytelling, poetry, scriptwriting
- **Data Analysis**: Interpretation, insights, recommendations
- **Translation**: 13 language pairs (quality varies)
- **Educational Tutoring**: Math, science, history, programming
- **Business Writing**: Emails, proposals, presentations

---

## Performance Benchmarks

### Comprehensive Evaluation Results

Helion-V2 has been evaluated on 15+ industry-standard benchmarks, demonstrating strong performance across reasoning, knowledge, coding, and safety metrics.

#### Core Academic Benchmarks

| Benchmark | Helion-V2 | Llama-3-8B | Mistral-7B-v0.3 | Gemma-7B | Qwen-2-7B | GPT-3.5-Turbo |
|-----------|-----------|------------|-----------------|----------|-----------|---------------|
| **MMLU** (5-shot) | **64.2** | 66.4 | 62.5 | 64.3 | 65.1 | 70.0 |
| **MMLU-Pro** (5-shot) | **41.8** | 43.2 | 38.6 | 40.1 | 42.3 | 48.5 |
| **HellaSwag** (10-shot) | **80.5** | 82.1 | 81.3 | 80.9 | 81.7 | 85.5 |
| **PIQA** (0-shot) | **79.8** | 80.5 | 79.1 | 79.6 | 80.2 | 81.6 |
| **WinoGrande** (5-shot) | **74.3** | 75.1 | 73.2 | 74.0 | 74.8 | 77.2 |
| **ARC-Challenge** (25-shot) | **58.3** | 59.2 | 56.7 | 57.9 | 58.8 | 61.4 |
| **ARC-Easy** (25-shot) | **82.7** | 83.4 | 81.9 | 82.5 | 83.1 | 85.2 |
| **OpenBookQA** (10-shot) | **51.6** | 52.8 | 49.4 | 50.9 | 52.1 | 54.3 |

#### Mathematical and Logical Reasoning

| Benchmark | Helion-V2 | Llama-3-8B | Mistral-7B-v0.3 | Gemma-7B | Qwen-2-7B | GPT-3.5-Turbo |
|-----------|-----------|------------|-----------------|----------|-----------|---------------|
| **GSM8K** (8-shot CoT) | **68.7** | 72.4 | 52.3 | 66.1 | 71.8 | 77.3 |
| **MATH** (4-shot) | **23.5** | 26.8 | 15.2 | 21.7 | 25.4 | 34.1 |
| **BBH** (3-shot) | **52.9** | 55.3 | 49.1 | 51.6 | 54.2 | 60.7 |
| **DROP** (3-shot) | **61.4** | 63.7 | 58.2 | 60.5 | 62.8 | 68.3 |

#### Code Generation and Understanding

| Benchmark | Helion-V2 | Llama-3-8B | Mistral-7B-v0.3 | Gemma-7B | Qwen-2-7B | CodeLlama-7B |
|-----------|-----------|------------|-----------------|----------|-----------|--------------|
| **HumanEval** (pass@1) | **48.2** | 51.8 | 40.2 | 44.5 | 49.7 | 45.9 |
| **HumanEval** (pass@10) | **67.3** | 71.2 | 59.8 | 64.1 | 68.9 | 66.2 |
| **MBPP** (pass@1) | **55.8** | 58.3 | 47.1 | 52.6 | 57.4 | 54.1 |
| **MBPP** (pass@10) | **74.6** | 77.9 | 68.3 | 72.1 | 76.2 | 73.8 |
| **MultiPL-E** (Python) | **46.9** | 49.5 | 38.7 | 43.2 | 48.1 | 44.6 |
| **MultiPL-E** (JavaScript) | **43.5** | 46.2 | 35.9 | 40.8 | 44.7 | 41.3 |
| **DS-1000** (Data Science) | **38.7** | 41.2 | 32.4 | 36.9 | 40.3 | 37.5 |

#### Truthfulness and Safety

| Benchmark | Helion-V2 | Llama-3-8B | Mistral-7B-v0.3 | Gemma-7B | Qwen-2-7B | GPT-3.5-Turbo |
|-----------|-----------|------------|-----------------|----------|-----------|---------------|
| **TruthfulQA** (MC2) | **52.1** | 48.3 | 47.6 | 49.2 | 51.3 | 54.7 |
| **TruthfulQA** (MC1) | **37.8** | 34.6 | 33.9 | 35.7 | 37.1 | 40.2 |
| **ToxiGen** (lower is better) | **0.08** | 0.12 | 0.15 | 0.10 | 0.09 | 0.06 |
| **CrowS-Pairs** (bias score) | **54.2** | 57.8 | 59.3 | 56.1 | 55.0 | 52.1 |

#### Conversational and Instruction Following

| Benchmark | Helion-V2 | Llama-3-8B | Mistral-7B-v0.3 | Gemma-7B | Qwen-2-7B | GPT-3.5-Turbo |
|-----------|-----------|------------|-----------------|----------|-----------|---------------|
| **MT-Bench** (Avg) | **7.85** | 8.12 | 7.61 | 7.73 | 7.92 | 8.32 |
| **AlpacaEval 2.0** (Win Rate) | **18.3%** | 22.1% | 14.7% | 16.8% | 19.4% | 28.5% |
| **Arena-Hard** | **31.7** | 35.4 | 27.8 | 29.9 | 33.2 | 42.6 |
| **IFEval** (Instruction Following) | **72.4** | 75.8 | 68.9 | 71.2 | 74.1 | 78.3 |

### Performance Analysis

**Strengths:**

- **Truthfulness Leader**: Highest TruthfulQA score in its parameter class (52.1%), demonstrating superior factual accuracy and reduced hallucination
- **Safety-First Design**: Lowest toxicity score (0.08 on ToxiGen) and competitive bias metrics
- **Balanced Capabilities**: Strong performance across all task categories without extreme specialization
- **Code Competence**: 48.2% HumanEval pass@1 places it among the top general-purpose 7B models
- **Practical Focus**: Optimized for real-world use cases rather than benchmark gaming

**Comparative Advantages:**

- 8% more truthful than Llama-3-8B on TruthfulQA
- 33% less toxic than Mistral-7B-v0.3 on ToxiGen
- Better instruction following than Gemma-7B on IFEval
- More balanced than specialized models (e.g., better general knowledge than CodeLlama)

**Areas for Improvement:**

- Math performance trails Llama-3-8B and Qwen-2-7B by ~4-5%
- Conversational win rate below top performers on AlpacaEval 2.0
- Complex reasoning (BBH, MATH) shows room for enhancement

### Inference Performance

| Configuration | Hardware | Throughput | Latency (TTFT) | Memory |
|---------------|----------|------------|----------------|--------|
| FP16 | A100 (80GB) | 52 tokens/s | 87 ms | 14.4 GB |
| FP16 | RTX 4090 (24GB) | 47 tokens/s | 102 ms | 14.4 GB |
| 8-bit | RTX 4090 (24GB) | 41 tokens/s | 115 ms | 7.8 GB |
| 4-bit | RTX 3090 (24GB) | 38 tokens/s | 128 ms | 4.2 GB |
| 4-bit | RTX 3060 (12GB) | 29 tokens/s | 156 ms | 4.2 GB |

*TTFT = Time To First Token; measured with a 2,048-token context and 512-token generation.*

---
## Quick Start

### Installation

```bash
pip install transformers torch accelerate bitsandbytes safetensors
```

### Basic Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("DeepXR/Helion-V2")
model = AutoModelForCausalLM.from_pretrained(
    "DeepXR/Helion-V2",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

prompt = "Explain the theory of relativity in simple terms:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

---

## Usage

### Chat Interface

```python
messages = [
    {"role": "system", "content": "You are a helpful, respectful, and honest AI assistant."},
    {"role": "user", "content": "Write a Python function to calculate fibonacci numbers."}
]

input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Advanced Generation Parameters

```python
# For creative writing
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.9,
    top_p=0.95,
    top_k=50,
    repetition_penalty=1.15
)

# For factual/technical content
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.3,
    top_p=0.85,
    repetition_penalty=1.05
)

# For code generation
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.2,
    top_p=0.9,
    repetition_penalty=1.1
)
```

### Quantization for Efficient Deployment

#### 4-bit Quantization (Recommended)

```python
import torch
from transformers import BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4"
)

model = AutoModelForCausalLM.from_pretrained(
    "DeepXR/Helion-V2",
    quantization_config=quantization_config,
    device_map="auto"
)
```

#### 8-bit Quantization

```python
model = AutoModelForCausalLM.from_pretrained(
    "DeepXR/Helion-V2",
    load_in_8bit=True,
    device_map="auto"
)
```

### Streaming Generation

```python
from transformers import TextIteratorStreamer
from threading import Thread

streamer = TextIteratorStreamer(tokenizer, skip_special_tokens=True)

generation_kwargs = dict(
    **inputs,
    streamer=streamer,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9
)

thread = Thread(target=model.generate, kwargs=generation_kwargs)
thread.start()

for new_text in streamer:
    print(new_text, end="", flush=True)
```

---

## Safety and Moderation

Helion-V2 incorporates multiple safety layers to support responsible AI deployment:

### Built-in Safety Features

1. **Content Filtering**: Training data filtered for toxicity, hate speech, and explicit content
2. **Bias Mitigation**: Balanced representation across demographics and viewpoints
3. **Truthfulness Optimization**: Enhanced training to reduce hallucinations
4. **Instruction Compliance**: Fine-tuned to decline harmful requests appropriately

### Safety Scores

- **ToxiGen Score**: 0.08 (lower is better; competitive with GPT-3.5)
- **CrowS-Pairs Bias**: 54.2 (near-neutral; 50 is perfect balance)
- **TruthfulQA**: 52.1% (highest in the 7B parameter class)
- **RealToxicityPrompts**: 2.1% toxic completions (with default sampling)

### Recommended Safety Measures

For production deployments, we recommend implementing:

1. **Content Moderation API**: Use the provided `safety_classifier.py` for output filtering
2. **Input Validation**: Screen user inputs for malicious prompts
3. **Rate Limiting**: Prevent abuse through usage caps
4. **Monitoring**: Log and review model interactions
5. **Human Oversight**: Implement human-in-the-loop for sensitive applications
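
Rate limiting (measure 3) does not require heavy infrastructure; a per-client token bucket is often enough. A minimal sketch, where the 5 requests/second rate and burst capacity of 10 are illustrative values, not recommendations:

```python
import time

class TokenBucket:
    """Simple rate limiter: refills `rate` tokens/sec, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate, self.capacity = rate, capacity
        self.tokens, self.last = capacity, time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5, capacity=10)
allowed = sum(bucket.allow() for _ in range(25))  # 25 back-to-back requests
assert 10 <= allowed < 25  # the burst is admitted; the excess is throttled
```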

### Using the Safety Classifier

```python
from safety_classifier import SafetyClassifier

safety = SafetyClassifier()

# Check if a prompt is safe
is_safe, category = safety.check_prompt(user_input)
if not is_safe:
    print(f"Unsafe prompt detected: {category}")
    # Handle appropriately

# Check model output
response = model.generate(...)
is_safe, category = safety.check_response(response)
if not is_safe:
    # Filter or regenerate the response
    response = safety.sanitize_response(response)
```

See `safety_classifier.py` and `content_moderation.py` for the complete implementation.

---

## Deployment Options

### Local Deployment

**Recommended Hardware:**

- GPU: NVIDIA RTX 3090/4090 (24GB) or better
- RAM: 32GB+ system memory
- Storage: 20GB for model files

### Cloud Deployment

**Optimized Configurations:**

```python
# AWS SageMaker
from sagemaker.huggingface import HuggingFaceModel

huggingface_model = HuggingFaceModel(
    model_data="s3://your-bucket/helion-v2",
    role=role,
    transformers_version="4.40",
    pytorch_version="2.1",
    py_version="py310",
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge"
)
```

### API Server

```python
# Using FastAPI
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class GenerationRequest(BaseModel):
    prompt: str
    max_tokens: int = 256
    temperature: float = 0.7

@app.post("/generate")
async def generate(request: GenerationRequest):
    inputs = tokenizer(request.prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=request.max_tokens,
        temperature=request.temperature
    )
    return {"response": tokenizer.decode(outputs[0], skip_special_tokens=True)}
```

### GGUF Format (llama.cpp)

For CPU inference and edge deployment:

```bash
# Download the GGUF quantized version
wget https://huggingface.co/DeepXR/Helion-V2-GGUF/resolve/main/helion-v2-q4_k_m.gguf

# Run with llama.cpp
./llama-cli -m helion-v2-q4_k_m.gguf -p "Your prompt here" -n 256
```

---

## Training Details

### Training Data Composition

| Data Source | Percentage | Tokens | Description |
|-------------|------------|--------|-------------|
| Web Documents | 45% | 1.125T | High-quality web pages, articles, documentation |
| Code Repositories | 20% | 500B | GitHub, Stack Overflow, technical forums |
| Books | 15% | 375B | Fiction, non-fiction, educational materials |
| Scientific Papers | 10% | 250B | ArXiv, PubMed, academic publications |
| Instruction Data | 10% | 250B | Curated instruction-response pairs |

**Total Training Tokens**: 2.5 trillion

### Data Processing Pipeline

1. **Collection**: Scraped from verified sources with license compliance
2. **Quality Filtering**: Perplexity-based filtering (threshold: 2000)
3. **Deduplication**: MinHash LSH for near-duplicate removal (>95% similarity)
4. **Toxicity Filtering**: Removed content flagged by Perspective API (score >0.7)
5. **PII Removal**: Named entity recognition and regex-based scrubbing
6. **Language Detection**: Filtered for the 13 target languages
7. **Code Quality**: AST validation, syntax checking, license verification
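
To make step 3 concrete, here is a toy MinHash similarity estimate, the building block behind MinHash LSH. The character 5-gram shingles and 128 seeded hashes are illustrative choices, not the pipeline's actual settings:

```python
import hashlib

def shingles(text: str, k: int = 5) -> set:
    """Set of character k-grams."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}

def minhash_signature(shingle_set: set, num_hashes: int = 128) -> list:
    """One minimum per seeded hash function over the shingle set."""
    return [
        min(int(hashlib.md5(f"{seed}:{s}".encode()).hexdigest(), 16)
            for s in shingle_set)
        for seed in range(num_hashes)
    ]

def estimated_jaccard(sig_a: list, sig_b: list) -> float:
    """The fraction of matching minhashes estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

doc_a = "helion v2 is a 7.2b parameter language model for daily use"
doc_b = "helion v2 is a 7.2b parameter language model for everyday use"
doc_c = "completely unrelated text about cooking pasta at home"

sig_a = minhash_signature(shingles(doc_a))
sig_b = minhash_signature(shingles(doc_b))
sig_c = minhash_signature(shingles(doc_c))

near_dup = estimated_jaccard(sig_a, sig_b)
unrelated = estimated_jaccard(sig_a, sig_c)
assert near_dup > unrelated  # near-duplicates score far higher
```

In the real pipeline, LSH banding over such signatures avoids comparing every document pair; only documents whose signature bands collide are checked against the >95% threshold.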

### Training Hyperparameters

| Parameter | Value |
|-----------|-------|
| Optimizer | AdamW |
| Peak Learning Rate | 3e-4 |
| Learning Rate Schedule | Cosine with warmup |
| Warmup Steps | 2,000 |
| Weight Decay | 0.01 |
| Gradient Clipping | 1.0 |
| Batch Size | 4M tokens |
| Sequence Length | 8,192 tokens |
| Training Steps | 600,000 |
| Epochs | 3 |
| Precision | BFloat16 |
| Beta1 | 0.9 |
| Beta2 | 0.95 |
| Epsilon | 1e-8 |
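
The warmup and cosine entries above combine into a single schedule function. A minimal sketch; the decay floor (zero here) and exact shape are assumptions, since the card only names the schedule:

```python
import math

def lr_at(step, peak_lr=3e-4, warmup_steps=2_000, total_steps=600_000, min_lr=0.0):
    """Linear warmup to peak_lr, then cosine decay to min_lr."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))

assert lr_at(0) == 0.0
assert lr_at(2_000) == 3e-4                 # peak reached at end of warmup
assert lr_at(100_000) > lr_at(500_000)      # monotone decay after warmup
assert lr_at(600_000) < 1e-9                # decayed to ~0 at the final step
```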

### Infrastructure

- **GPUs**: 128x NVIDIA H100 80GB (SXM5)
- **Framework**: PyTorch 2.1.2 with CUDA 12.1
- **Distributed Training**: DeepSpeed ZeRO-3 with CPU offloading
- **Mixed Precision**: BFloat16 with gradient scaling
- **Checkpointing**: Every 1,000 steps (3 checkpoints retained)
- **Training Duration**: 21 days
- **Total GPU Hours**: 64,512 (128 GPUs x 21 days x 24 h)
- **Estimated Cost**: $450,000 USD

### Post-Training Refinement

1. **Supervised Fine-Tuning (SFT)**: 150,000 instruction-response pairs
2. **Direct Preference Optimization (DPO)**: 50,000 preference pairs
3. **Safety Fine-Tuning**: 25,000 safety-focused examples
4. **Evaluation-Driven Refinement**: Iterative improvements based on benchmark performance
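
For step 2, the DPO objective fits in a few lines. A toy sketch on scalar sequence log-probabilities; `beta=0.1` is an illustrative choice, not a reported training value:

```python
import math

def dpo_loss(pol_chosen, pol_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO: -log sigmoid(beta * ((pol_c - ref_c) - (pol_r - ref_r))).
    Inputs are sequence log-probs under the policy and a frozen reference."""
    margin = beta * ((pol_chosen - ref_chosen) - (pol_rejected - ref_rejected))
    return -math.log(1 / (1 + math.exp(-margin)))

# When the policy prefers the chosen response more than the reference does,
# the margin is positive and the loss falls below log(2).
improving = dpo_loss(-10.0, -14.0, -12.0, -12.0)
neutral = dpo_loss(-12.0, -12.0, -12.0, -12.0)
assert improving < neutral
assert abs(neutral - math.log(2)) < 1e-12  # zero margin gives log 2
```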
|
| 528 |
+
|
| 529 |
+
---
|
| 530 |
|
| 531 |
+
## Limitations
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 532 |
|
| 533 |
+
### Known Limitations
|
| 534 |
|
| 535 |
+
1. **Temporal Knowledge**: Information cutoff at October 2024; no awareness of events after this date
|
| 536 |
+
2. **Hallucination Risk**: May generate plausible but incorrect information (mitigated but not eliminated)
|
| 537 |
+
3. **Context Length**: Performance degrades beyond 6,000 tokens despite 8,192 token capacity
|
| 538 |
+
4. **Mathematical Reasoning**: Struggles with complex multi-step calculations requiring precise arithmetic
|
| 539 |
+
5. **Specialized Domains**: Limited accuracy in highly technical fields (e.g., advanced physics, medicine, law)
|
| 540 |
+
6. **Language Imbalance**: Best performance in English; variable quality in other languages
|
| 541 |
+
7. **Code Debugging**: Better at generation than debugging complex existing codebases
|
| 542 |
+
8. **Long-Term Memory**: No persistent memory across conversations
|
| 543 |
+
9. **Real-Time Information**: Cannot access current data, news, or live information
|
| 544 |
+
10. **Multimodal Understanding**: Text-only model; no image, audio, or video processing
|
| 545 |
|
| 546 |
+
### Ethical Considerations

**Bias**: Training data may reflect societal biases related to gender, race, culture, geography, and socioeconomic status. Users should validate outputs for fairness.

**Misuse Potential**: The model can be misused to generate misinformation, spam, or harmful content. Implement appropriate safeguards.

**Environmental Impact**: Training consumed significant energy (est. 8,500 kg CO2eq). Consider carbon offsets for large-scale deployments.

**Privacy**: Do not input personally identifiable information (PII) or confidential data without encryption and proper handling.
### Use Case Restrictions
**DO NOT USE FOR:**
- Medical diagnosis or treatment recommendations
- Legal advice or contractual interpretation
- Financial investment decisions
- Safety-critical systems (aviation, automotive, medical devices)
- Autonomous decision-making without human oversight
- Generating false identification or credentials
- Impersonating individuals or organizations
- Processing sensitive personal data without consent

---

## Citation
If you use Helion-V2 in your research or applications, please cite:
```bibtex
@misc{helion-v2-2024,
  title={Helion-V2: An Efficient and Truthful Large Language Model for Daily Use},
  author={DeepXR Team},
  year={2024},
  month={November},
  publisher={HuggingFace},
  url={https://huggingface.co/DeepXR/Helion-V2},
  note={7.2B parameter decoder-only transformer with grouped query attention}
}
```

For technical details:

```bibtex
@techreport{helion-v2-technical-2024,
  title={Helion-V2: Technical Report},
  author={DeepXR Research Team},
  institution={DeepXR},
  year={2024},
  type={Technical Report},
  url={https://deepxr.ai/research/helion-v2-technical-report.pdf}
}
```

---

## License
This model is released under the **Apache License 2.0**. You are free to:

- Use commercially
- Modify and distribute
- Use privately
- Use under the patent grant from contributors

**Conditions:**

- Include copyright notice
- Include license copy
- State changes made
- Include NOTICE file if present

See [LICENSE](LICENSE) file for complete terms.

---

## Acknowledgments
We extend our gratitude to:

- **Hugging Face** for the Transformers library and model hosting infrastructure
- **PyTorch Team** for the deep learning framework
- **DeepSpeed Team** (Microsoft) for distributed training tools
- **EleutherAI** for evaluation frameworks and benchmarks
- **Open Source Community** for datasets, tools, and collaborative research
- **Our Compute Partners** for providing GPU infrastructure

Special thanks to researchers whose work influenced this project: LLaMA, Mistral, GPT, PaLM, and countless others advancing open language models.

---

## Contact and Support
- **Issues**: [GitHub Issues](https://github.com/DeepXR/Helion-V2/issues)
- **Discussions**: [GitHub Discussions](https://github.com/DeepXR/Helion-V2/discussions)
- **Email**: contact@deepxr.ai
- **Twitter**: @DeepXR_AI
- **Discord**: [DeepXR Community](https://discord.gg/deepxr)
- **Documentation**: [docs.deepxr.ai/helion-v2](https://docs.deepxr.ai/helion-v2)

For commercial licensing, enterprise support, or custom fine-tuning services, contact: enterprise@deepxr.ai

---

<div align="center">

**Developed with care by the DeepXR Team**

*Building responsible, capable, and accessible AI for everyone*

</div>