---
license: apache-2.0
base_model: meta-llama/Llama-2-7b-hf
tags:
- text-generation
- conversational
- llama-2
- autotrain_compatible
- function-calling
language:
- en
pipeline_tag: text-generation
library_name: transformers
model-index:
- name: Helion-V1.5
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MT-Bench
      type: mt-bench
    metrics:
    - type: score
      value: 7.2
      name: MT-Bench Score
  - task:
      type: text-generation
      name: Conversational
    dataset:
      name: AlpacaEval
      type: alpaca-eval
    metrics:
    - type: win_rate
      value: 78.5
      name: Win Rate %
  - task:
      type: text-generation
      name: Code Generation
    dataset:
      name: HumanEval
      type: humaneval
    metrics:
    - type: pass@1
      value: 42.3
      name: Pass@1
widget:
- text: "Explain the difference between machine learning and deep learning"
  example_title: "Technical Explanation"
- text: "Write a Python function to calculate fibonacci numbers"
  example_title: "Code Generation"
---
<div align="center">
<img src="https://imgur.com/aUIJXf7.png" alt="Helion-V1 Logo" width="100%"/>
</div>
---
# Helion-V1.5
**Helion-V1.5** is a 7B-parameter conversational AI model fine-tuned from Llama-2 using QLoRA. It improves on Helion-V1 in instruction following, code generation, and multi-turn dialogue.
## Model Details
- **Architecture:** Llama-2-7B with LoRA adapters
- **Parameters:** 7 billion (base) + 67M (LoRA)
- **Context Length:** 4096 tokens
- **Training:** QLoRA (4-bit) fine-tuning on high-quality instruction data
- **License:** Apache 2.0
### Key Improvements over Helion-V1
| Feature | Helion-V1 | Helion-V1.5 | Improvement |
|---------|-----------|-------------|-------------|
| **MT-Bench Score** | 6.8 | 7.2 | +5.9% |
| **AlpacaEval Win Rate** | 72.3% | 78.5% | +8.6% |
| **HumanEval Pass@1** | 38.1% | 42.3% | +11.0% |
| **Avg Response Time** | 2.3s | 1.8s | -21.7% |
| **Function Calling** | ❌ | ✅ | New |
| **Streaming Support** | Basic | Full | Enhanced |
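
The last column of the table is a relative change, not a difference in points. As a minimal sketch of that arithmetic (the helper name `relative_change` is introduced here for illustration and is not part of any Helion tooling), the figures can be reproduced from the two score columns:

```python
def relative_change(old: float, new: float) -> float:
    """Return the change from `old` to `new` as a percentage of `old`."""
    return (new - old) / old * 100.0


# (Helion-V1, Helion-V1.5) pairs copied from the comparison table.
rows = {
    "MT-Bench Score": (6.8, 7.2),
    "AlpacaEval Win Rate": (72.3, 78.5),
    "HumanEval Pass@1": (38.1, 42.3),
    "Avg Response Time": (2.3, 1.8),
}

for name, (v1, v15) in rows.items():
    print(f"{name}: {relative_change(v1, v15):+.1f}%")
```

For response time a negative value is the improvement, since lower latency is better.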
### Technical Specifications
| Component | Value |
|-----------|-------|
| Hidden Size | 4096 |
| Layers | 32 |
| Attention Heads | 32 |
| Intermediate Size | 11008 |
| Vocabulary | 32000 tokens |
| Position Encoding | RoPE |
| Precision | bfloat16 |

**LoRA Configuration:**
- Rank: 64
- Alpha: 128
- Target Modules: All linear layers (q,k,v,o,gate,up,down)
- Dropout: 0.05
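
The number of trainable LoRA parameters follows directly from the rank and the shapes of the adapted layers: each adapted linear layer with `in` inputs and `out` outputs adds `rank * (in + out)` weights. The sketch below is a rough check under the dimensions listed in Technical Specifications; the module names (`q_proj`, `gate_proj`, and so on) are the ones used by the Llama implementation in `transformers`, and the function `lora_params` is hypothetical.

```python
HIDDEN = 4096          # Hidden Size
INTERMEDIATE = 11008   # Intermediate Size
LAYERS = 32            # Layers
RANK = 64              # LoRA rank

# (in_features, out_features) for each linear projection in one decoder layer.
# Assumes standard multi-head attention, so k_proj and v_proj are HIDDEN x HIDDEN.
SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(modules, rank=RANK, layers=LAYERS):
    """Count LoRA weights: one (in x rank) and one (rank x out) matrix per module."""
    per_layer = sum(rank * (SHAPES[m][0] + SHAPES[m][1]) for m in modules)
    return per_layer * layers


attention_only = lora_params(["q_proj", "k_proj", "v_proj", "o_proj"])
all_linear = lora_params(list(SHAPES))

print(f"attention projections only: {attention_only:,}")  # 67,108,864
print(f"all seven linear layers:    {all_linear:,}")      # 159,907,840
```

Under these assumptions, the 67M figure quoted in Model Details matches adapters on the four attention projections, while adapting all seven linear layers at rank 64 would give roughly 160M.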
## Performance Benchmarks
| Benchmark | Score | Category |
|-----------|-------|----------|
| MT-Bench | 7.2/10 | Multi-turn conversation |
| AlpacaEval | 78.5% | Instruction following |
| HumanEval | 42.3% | Code generation |
| GSM8K | 35.7% | Mathematical reasoning |
| TruthfulQA | 51.2% | Factual accuracy |
| MMLU | 48.9% | Knowledge |
## How to Use
### Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer
model_name = "DeepXR/Helion-V1.5"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Prepare messages
messages = [
    {"role": "user", "content": "Explain machine learning in simple terms"}
]

# Apply chat template
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Generate response
output = model.generate(
    input_ids,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```
### Using with Text Generation Inference (TGI)
```bash
docker run --gpus all --shm-size 1g -p 8080:80 \
ghcr.io/huggingface/text-generation-inference:latest \
--model-id DeepXR/Helion-V1.5 \
--max-input-length 3584 \
--max-total-tokens 4096
```
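
Once the container is running, the server can be queried over HTTP. The following is a minimal client sketch using only the Python standard library; it assumes the server is reachable on `localhost:8080` (the port mapped in the command above) and uses TGI's `/generate` route. The helper names `build_request` and `generate` are illustrative, and the prompt is sent as raw text without the chat template applied.

```python
import json
import urllib.request

# Port 8080 comes from the `-p 8080:80` mapping in the docker command.
TGI_URL = "http://localhost:8080/generate"


def build_request(prompt, max_new_tokens=512, temperature=0.7, top_p=0.9):
    """Build a POST request for TGI's /generate route."""
    payload = {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
            "top_p": top_p,
            "do_sample": True,
        },
    }
    return urllib.request.Request(
        TGI_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, timeout=60.0):
    """Send the prompt to the running server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["generated_text"]


# Example (requires the container to be running):
# print(generate("Explain machine learning in simple terms"))
```

Keeping `max_new_tokens` at 512 matches the server limits above, since 3584 input tokens plus 512 new tokens equals the 4096-token total.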
### Using with vLLM
```python
from vllm import LLM, SamplingParams

# Load the model and set sampling parameters
llm = LLM(model="DeepXR/Helion-V1.5")
sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=512)

prompts = ["Explain quantum computing"]
outputs = llm.generate(prompts, sampling_params)

# Each result holds one or more completions; print the first for each prompt
for output in outputs:
    print(output.outputs[0].text)
```
### Using with LangChain
```python
# Requires the langchain-huggingface package
from langchain_huggingface import HuggingFacePipeline
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="DeepXR/Helion-V1.5",
    max_new_tokens=512,
)

llm = HuggingFacePipeline(pipeline=pipe)
response = llm.invoke("What is artificial intelligence?")
print(response)
```
## Training Data
### Dataset Composition
The model was trained on a curated dataset including:
- **Conversational Data** (40%): Multi-turn dialogues focusing on helpfulness
- **Instruction Following** (30%): Task completion and instruction adherence
- **Safety Examples** (15%): Refusal training for harmful requests
- **Domain-Specific** (15%): Programming, writing, and analysis tasks

**Total Training Examples:** ~50,000

**Data Quality:** Manually filtered and safety-checked
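
For a sense of scale, the percentages above can be turned into approximate example counts. This is only back-of-the-envelope arithmetic on the stated figures; the total is approximate, so the counts are too.

```python
TOTAL_EXAMPLES = 50_000  # approximate total stated above

# Category shares in whole percent, copied from the composition list.
MIX = {
    "Conversational Data": 40,
    "Instruction Following": 30,
    "Safety Examples": 15,
    "Domain-Specific": 15,
}

# Integer arithmetic avoids floating-point rounding in the counts.
counts = {name: TOTAL_EXAMPLES * pct // 100 for name, pct in MIX.items()}

for name, n in counts.items():
    print(f"{name}: ~{n:,} examples")
```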
### Data Processing
- Deduplication using MinHash
- Safety filtering for harmful content
- Quality scoring and filtering (score > 0.7)
- Format standardization to chat template
- Context length trimming (max 4096 tokens)
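
To illustrate the deduplication step, here is a small, self-contained MinHash sketch. It is not the pipeline used to build the training set: the signature length, shingle size, and similarity cutoff below are hypothetical values chosen for the example, and a production pipeline would normally add locality-sensitive hashing instead of comparing every pair.

```python
import hashlib

NUM_HASHES = 64      # hypothetical signature length
SHINGLE_SIZE = 3     # hypothetical word n-gram size
DUP_THRESHOLD = 0.8  # hypothetical near-duplicate cutoff


def shingles(text, n=SHINGLE_SIZE):
    """Split text into a set of overlapping word n-grams."""
    words = text.lower().split()
    if len(words) < n:
        return {" ".join(words)}
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def minhash(text):
    """Build a signature: the minimum hash of the shingles under each salted hash."""
    grams = shingles(text)
    signature = []
    for seed in range(NUM_HASHES):
        salt = seed.to_bytes(4, "big")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(g.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for g in grams
        ))
    return tuple(signature)


def similarity(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching signature slots."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def deduplicate(examples):
    """Keep the first occurrence of each group of near-duplicate texts."""
    kept, signatures = [], []
    for text in examples:
        sig = minhash(text)
        if all(similarity(sig, other) < DUP_THRESHOLD for other in signatures):
            kept.append(text)
            signatures.append(sig)
    return kept
```

Calling `deduplicate` on a list of example texts returns the list with later near-duplicates dropped, preserving the original order.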
## Evaluation
### Benchmark Results
| Benchmark | Score | Description |
|-----------|-------|-------------|
| **MT-Bench** | 7.2/10 | Multi-turn conversation quality |
| **AlpacaEval** | 78.5% | Win rate vs. text-davinci-003 |
| **HumanEval** | 42.3% | Python code generation (pass@1) |
| **GSM8K** | 35.7% | Math word problems |
| **TruthfulQA** | 51.2% | Truthfulness in answers |
| **MMLU** | 48.9% | Multi-task language understanding |
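
For readers unfamiliar with the HumanEval metric, pass@k is usually estimated per problem from `n` sampled completions of which `c` pass the unit tests, then averaged over all problems. The sketch below shows that standard estimator; this card does not state how many samples per problem were drawn, so `n` and `c` here are purely illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: 1 - C(n - c, k) / C(n, k)."""
    if n - c < k:
        # Fewer than k failing samples: every size-k draw contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_score(results, k=1):
    """Average pass@k over problems; `results` is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For `k = 1` the estimator reduces to `c / n`, the fraction of samples that pass.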
## Capabilities
### Advanced Features
- **Function Calling**: Supports structured function/tool calling
- **Code Generation**: Can generate and explain code across multiple languages
- **Multi-turn Context**: Maintains conversation context up to 4096 tokens
- **Streaming Support**: Compatible with streaming inference
- **Batch Processing**: Efficient batch generation support
- **Custom System Prompts**: Flexible system message configuration
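
Because the context window is fixed at 4096 tokens, long conversations need to be trimmed before each request. One possible approach, sketched below, keeps the system message and drops the oldest turns first. The function `trim_history` and the `count_tokens` callback are hypothetical, the budget reserves 512 tokens for the reply, and tokens added by the chat template itself are not counted.

```python
CONTEXT_LIMIT = 4096      # context window from this model card
RESERVED_FOR_REPLY = 512  # assumed room left for the generated answer


def trim_history(messages, count_tokens, budget=CONTEXT_LIMIT - RESERVED_FOR_REPLY):
    """Keep system messages plus the most recent turns that fit in `budget` tokens."""
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]

    used = sum(count_tokens(m["content"]) for m in system)
    kept = []
    # Walk backwards so the newest turns are kept first.
    for message in reversed(turns):
        cost = count_tokens(message["content"])
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()

    # Drop leading assistant turns so the history starts with a user message.
    while kept and kept[0]["role"] != "user":
        kept.pop(0)
    return system + kept
```

In practice `count_tokens` could be something like `lambda text: len(tokenizer.encode(text))` with the tokenizer loaded in Quick Start.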
## Limitations
### Known Limitations
1. **Knowledge Cutoff:** Training data up to April 2023
2. **Hallucinations:** May generate plausible but incorrect information
3. **Context Limitations:** 4096 token context window
4. **Math Reasoning:** Struggles with complex multi-step calculations
5. **Multilingual:** Primarily English; limited support for other languages
6. **Temporal Reasoning:** May not accurately understand time-sensitive queries
7. **Factual Accuracy:** Not suitable as sole source of truth
### Bias and Fairness
The model may exhibit biases present in the training data. We've implemented:
- Bias evaluation across demographic groups
- Regular fairness audits
- User feedback integration
- Ongoing bias mitigation efforts
## Responsible Use
Users should:
- Verify critical information from authoritative sources
- Implement appropriate safeguards for production use
- Monitor outputs for accuracy and appropriateness
- Comply with applicable laws and regulations
- Provide proper attribution for AI-generated content
## Citation
```bibtex
@misc{helion-v1.5-2025,
  author = {DeepXR},
  title = {Helion-V1.5: Enhanced Conversational AI},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/DeepXR/Helion-V1.5}
}
```
---
**Model Version:** 1.5.0 | **Release:** December 2025