Trouter-Library committed
Commit 33ccf4b · verified · 1 parent: 4800535

Update README.md

Files changed (1): README.md (+185 −3)
# Helion-V2

Helion-V2 is a state-of-the-art large language model designed for daily use, delivering intelligent and contextually aware responses across diverse tasks, including reasoning, coding, creative writing, and general knowledge.

## Model Details

**Model Type:** Causal Language Model (Transformer-based)
**Architecture:** Decoder-only transformer with optimized attention mechanisms
**Parameters:** 7.2 billion
**Context Length:** 8,192 tokens
**Training Data Cutoff:** October 2024
**License:** Apache 2.0
**Developed by:** DeepXR

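As a quick sanity check, the stated parameter count can be verified once the model is loaded (a minimal sketch, assuming `model` has been loaded as shown in the Usage section below):

```python
# Assumes `model` was loaded as in the Usage section below.
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e9:.1f}B parameters")  # expected: roughly 7.2B
```
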
### Key Features

- High-quality reasoning and problem-solving capabilities
- Strong performance on coding tasks with multi-language support
- Enhanced instruction following and conversational ability
- Efficient inference suitable for consumer hardware
- Fine-tuned for factual accuracy and reduced hallucinations

## Performance Benchmarks

Helion-V2 demonstrates competitive performance against leading open-source models in its parameter class:

| Benchmark | Helion-V2 | Llama-3-8B | Mistral-7B | Gemma-7B | Qwen-2-7B |
|-----------|-----------|------------|------------|----------|-----------|
| **MMLU** (5-shot) | 64.2 | 66.4 | 62.5 | 64.3 | 65.1 |
| **HellaSwag** (10-shot) | 80.5 | 82.1 | 81.3 | 80.9 | 81.7 |
| **ARC-Challenge** (25-shot) | 58.3 | 59.2 | 56.7 | 57.9 | 58.8 |
| **TruthfulQA** (MC2) | 52.1 | 48.3 | 47.6 | 49.2 | 51.3 |
| **GSM8K** (8-shot CoT) | 68.7 | 72.4 | 52.3 | 66.1 | 71.8 |
| **HumanEval** (pass@1) | 48.2 | 51.8 | 40.2 | 44.5 | 49.7 |
| **MT-Bench** (avg. score) | 7.85 | 8.12 | 7.61 | 7.73 | 7.92 |
| **AlpacaEval 2.0** (win rate, %) | 18.3 | 22.1 | 14.7 | 16.8 | 19.4 |

**Strengths:**
- Highest TruthfulQA (MC2) score among the compared models
- Competitive multi-turn conversational ability (MT-Bench)
- Balanced performance across reasoning and knowledge tasks
- Optimized for practical, everyday use cases

## Usage

### Installation

```bash
pip install transformers torch accelerate
```

### Basic Inference

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "DeepXR/Helion-V2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # half precision to cut memory use
    device_map="auto"           # spread weights across available devices
)

prompt = "Explain quantum entanglement in simple terms:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.9,
    do_sample=True
)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
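
For interactive use, tokens can be printed as they are generated with `TextStreamer` from transformers; a minimal sketch reusing `model`, `tokenizer`, and `inputs` from the example above:

```python
from transformers import TextStreamer

# Stream decoded tokens to stdout as they are generated; skip_prompt hides the input.
streamer = TextStreamer(tokenizer, skip_prompt=True)
_ = model.generate(**inputs, max_new_tokens=256, do_sample=True,
                   temperature=0.7, top_p=0.9, streamer=streamer)
```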

### Chat Template

```python
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "What is the capital of France?"}
]

input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
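
Note that decoding `outputs[0]` also includes the prompt tokens. To keep only the assistant's reply and continue a multi-turn conversation, slice off the input first; a short sketch reusing the objects above:

```python
# Decode only the newly generated tokens, then extend the conversation.
reply = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
messages.append({"role": "assistant", "content": reply})
messages.append({"role": "user", "content": "What is its population?"})
```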

## Quantization

For efficient deployment on consumer hardware:

### 4-bit Quantization (bitsandbytes)

The snippet below applies on-the-fly 4-bit quantization with bitsandbytes (`pip install bitsandbytes`); pre-quantized GPTQ or AWQ checkpoints, where published, also load directly through `from_pretrained`.

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# On-the-fly 4-bit quantization via bitsandbytes.
quant_config = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "DeepXR/Helion-V2",
    quantization_config=quant_config,
    device_map="auto"
)
```

### GGUF (llama.cpp)

```bash
# Download a quantized GGUF model
# Q4_K_M is recommended for the best quality/size balance
wget https://huggingface.co/DeepXR/Helion-V2-GGUF/resolve/main/helion-v2-q4_k_m.gguf
```
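
The downloaded file can then be run locally, for example through the `llama-cpp-python` bindings (`pip install llama-cpp-python`); a minimal sketch, assuming the filename from the command above:

```python
from llama_cpp import Llama

# Load the quantized model with the full 8,192-token context window.
llm = Llama(model_path="helion-v2-q4_k_m.gguf", n_ctx=8192)

out = llm("Explain quantum entanglement in simple terms:",
          max_tokens=256, temperature=0.7)
print(out["choices"][0]["text"])
```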

## Training Details

### Training Data

Helion-V2 was trained on a diverse corpus including:
- High-quality web documents and articles
- Scientific papers and technical documentation
- Code repositories spanning multiple programming languages
- Books and educational materials
- Instruction-following datasets with human feedback

Total training tokens: approximately 2.5 trillion

### Training Procedure

- **Framework:** PyTorch with DeepSpeed ZeRO-3
- **Optimizer:** AdamW with a cosine learning rate schedule (sketched below)
- **Peak Learning Rate:** 3e-4
- **Batch Size:** 4M tokens per batch
- **Training Duration:** 3 epochs over the filtered dataset
- **Hardware:** 128x NVIDIA H100 GPUs

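For illustration, the schedule above follows the standard warmup-then-cosine-decay shape; the linear warmup phase and the step counts in this sketch are assumptions, not disclosed training values:

```python
import math

PEAK_LR = 3e-4          # peak learning rate from the list above
WARMUP_STEPS = 2_000    # placeholder: warmup length is not disclosed
TOTAL_STEPS = 100_000   # placeholder: total step count is not disclosed

def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return 0.5 * PEAK_LR * (1.0 + math.cos(math.pi * progress))
```
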
### Instruction Tuning

After pretraining, the model underwent supervised fine-tuning on 150K high-quality instruction-response pairs, followed by direct preference optimization (DPO) using human preference data.

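For reference, DPO trains the policy $\pi_\theta$ directly on preference pairs against a frozen reference policy $\pi_{\text{ref}}$ (here, the supervised fine-tuned model), where $y_w$ and $y_l$ are the preferred and dispreferred responses to prompt $x$, $\sigma$ is the logistic function, and $\beta$ controls the strength of the implicit KL constraint:

$$
\mathcal{L}_{\text{DPO}}(\pi_\theta; \pi_{\text{ref}}) = -\,\mathbb{E}_{(x,\, y_w,\, y_l)}\left[ \log \sigma\!\left( \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\text{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\text{ref}}(y_l \mid x)} \right) \right]
$$
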
## Limitations

- Knowledge cutoff at October 2024; may not reflect recent events
- Can occasionally generate incorrect or nonsensical information
- May struggle with highly specialized technical or domain-specific queries
- Performance degrades with very long contexts (>6K tokens); see the snippet below
- Not specifically trained for safety; may require additional guardrails for production use

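A simple guard against the long-context degradation noted above is to cap the prompt so that prompt plus generation budget stays within the 8,192-token window; a sketch reusing `tokenizer`, `model`, and `prompt` from the Usage section:

```python
# Truncate the prompt so prompt + generated tokens fit in the context window.
MAX_CTX, MAX_NEW = 8192, 256
inputs = tokenizer(prompt, return_tensors="pt",
                   truncation=True, max_length=MAX_CTX - MAX_NEW).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=MAX_NEW)
```
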
## Ethical Considerations

Users should be aware of potential biases in model outputs and verify critical information against authoritative sources. This model should not be used for:
- Making medical, legal, or financial decisions without expert consultation
- Generating harmful, misleading, or malicious content
- Impersonating individuals or organizations

## Citation

```bibtex
@misc{helion-v2-2024,
  title={Helion-V2: An Efficient Large Language Model for Daily Use},
  author={DeepXR Team},
  year={2024},
  publisher={HuggingFace},
  url={https://huggingface.co/DeepXR/Helion-V2}
}
```

## License

This model is released under the Apache 2.0 License. See the LICENSE file for details.

## Contact

For questions, issues, or collaboration inquiries:
- GitHub Issues: https://github.com/DeepXR/Helion-V2/issues
- Email: contact@deepxr.ai

## Acknowledgments

We thank the open-source community for the tools and frameworks that made this work possible, including Hugging Face Transformers, PyTorch, and DeepSpeed.