---
license: apache-2.0
---

# Helion-2.5-Rnd

**DeepXR/Helion-2.5-Rnd** - Advanced Research & Development Language Model

## Overview

Helion-2.5-Rnd is a research language model designed for strong performance across multiple domains, including:

- **Advanced Reasoning:** Complex problem-solving and logical deduction
- **Code Generation:** Multi-language programming assistance
- **Mathematical Computation:** Proof generation and symbolic mathematics
- **Multilingual Understanding:** 50+ languages with cultural context
- **Creative Writing:** Story generation, poetry, and content creation
- **Scientific Analysis:** Research paper understanding and synthesis
- **Long Context:** Context window of up to 131,072 tokens (128K)

## Model Architecture

- **Type:** Transformer-based causal language model
- **Parameters:** 70B+
- **Architecture:** LLaMA-based with YaRN positional embedding scaling
- **Context Window:** 131,072 tokens (128K)
- **Precision:** BF16/FP16, with INT8/INT4 quantization support
- **Training Data:** 2.5 trillion tokens across diverse domains

## Quick Start

### Installation

```bash
# Clone the repository
git clone https://huggingface.co/DeepXR/Helion-2.5-Rnd
cd Helion-2.5-Rnd

# Install dependencies
pip install -r requirements.txt

# Or use Docker
docker build -t helion:2.5-rnd .
```

### Running the Server

#### Using Python

```bash
python -m inference.server \
    --model /path/to/model \
    --tensor-parallel-size 2 \
    --max-model-len 131072 \
    --gpu-memory-utilization 0.95
```

#### Using Docker

```bash
docker run -d \
    --gpus all \
    -p 8000:8000 \
    -v /path/to/model:/models/helion \
    -e MODEL_PATH=/models/helion \
    -e TENSOR_PARALLEL_SIZE=2 \
    helion:2.5-rnd
```

### Using the Client

```python
from inference.client import HelionClient, HelionAssistant

# Basic client
client = HelionClient(base_url="http://localhost:8000")

# Simple completion
response = client.complete(
    "Explain quantum entanglement:",
    temperature=0.7,
    max_tokens=500
)

# Chat interface
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "What is machine learning?"}
]
response = client.chat(messages=messages)

# High-level assistant
assistant = HelionAssistant()
response = assistant.chat("Write a Python function for quicksort")
```

## API Endpoints

### Chat Completions

```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "DeepXR/Helion-2.5-Rnd",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ],
    "temperature": 0.7,
    "max_tokens": 1000
  }'
```
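The same OpenAI-compatible request can be issued from Python. The sketch below only builds the JSON body shown in the curl example; `build_chat_payload` is an illustrative helper, not part of the Helion client library, and the resulting body can be sent with any HTTP client.

```python
import json

# Build the JSON body for the /v1/chat/completions endpoint.
# Field names mirror the curl example above; this helper is
# illustrative, not part of the Helion codebase.
def build_chat_payload(messages, model="DeepXR/Helion-2.5-Rnd",
                       temperature=0.7, max_tokens=1000):
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }

payload = build_chat_payload(
    [{"role": "user", "content": "Hello, how are you?"}]
)
body = json.dumps(payload)  # POST this, e.g. with requests.post(url, data=body)
```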

### Text Completions

```bash
curl -X POST http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "DeepXR/Helion-2.5-Rnd",
    "prompt": "Once upon a time",
    "temperature": 0.8,
    "max_tokens": 500
  }'
```

### Health Check

```bash
curl http://localhost:8000/health
```

## Configuration

### Model Parameters

See `model_config.yaml` for full configuration options:

- **Temperature:** 0.0-2.0 (default: 0.7)
- **Top-p:** 0.0-1.0 (default: 0.9)
- **Top-k:** integer (default: 50)
- **Max Tokens:** 1-131072 (default: 4096)
- **Repetition Penalty:** 1.0-2.0 (default: 1.1)
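The documented ranges can be enforced before a request is sent. This is a minimal sketch of such a validator; it is not part of `model_config.yaml` or the Helion codebase, and the range for top-k is assumed to be any positive integer.

```python
# Defaults and ranges copied from the parameter list above.
DEFAULTS = {"temperature": 0.7, "top_p": 0.9, "top_k": 50,
            "max_tokens": 4096, "repetition_penalty": 1.1}
RANGES = {"temperature": (0.0, 2.0), "top_p": (0.0, 1.0),
          "top_k": (1, 2**31 - 1), "max_tokens": (1, 131072),
          "repetition_penalty": (1.0, 2.0)}

def validate_params(**overrides):
    """Merge overrides onto the defaults, rejecting out-of-range values."""
    params = dict(DEFAULTS)
    for name, value in overrides.items():
        lo, hi = RANGES[name]
        if not lo <= value <= hi:
            raise ValueError(f"{name}={value} outside [{lo}, {hi}]")
        params[name] = value
    return params
```

For example, `validate_params(temperature=0.3)` returns the full parameter dict with only the temperature changed, while `validate_params(max_tokens=200000)` raises because it exceeds the 131072-token limit.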

### Hardware Requirements

**Minimum:**

- 2x NVIDIA A100 80GB GPUs
- 256GB RAM
- 500GB NVMe SSD

**Recommended:**

- 4x NVIDIA H100 80GB GPUs
- 512GB RAM
- 1TB NVMe SSD
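A back-of-envelope calculation explains the two-GPU minimum. Assuming exactly 70B parameters and the precisions listed under Model Architecture, the weights alone occupy roughly 140 GB in BF16, more than a single 80 GB GPU can hold; the figures below exclude KV cache, activations, and CUDA overhead, so real usage is higher.

```python
# Rough weight-memory estimate for a 70B-parameter model.
# Excludes KV cache, activations, and runtime overhead.
PARAMS = 70e9
BYTES_PER_PARAM = {"bf16": 2, "fp16": 2, "int8": 1, "int4": 0.5}

def weight_gb(precision):
    """Approximate weight memory in GB at the given precision."""
    return PARAMS * BYTES_PER_PARAM[precision] / 1e9

# BF16 -> ~140 GB (hence 2x 80GB GPUs minimum);
# INT8 -> ~70 GB; INT4 -> ~35 GB.
```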

## Capabilities

### Code Generation

```python
messages = [
    {"role": "user", "content": "Write a binary search tree implementation in Rust"}
]
response = client.chat(messages=messages, temperature=0.3)
```

### Mathematical Reasoning

```python
response = client.complete(
    "Prove that the square root of 2 is irrational using contradiction:",
    temperature=0.5
)
```

### Creative Writing

```python
response = client.complete(
    "Write a haiku about artificial intelligence:",
    temperature=0.9
)
```
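The three examples above deliberately vary the temperature: low for code (0.3), moderate for mathematical reasoning (0.5), and high for creative writing (0.9). A hypothetical helper, not part of the Helion client API, can encode that convention:

```python
# Illustrative mapping from task type to sampling temperature,
# mirroring the three capability examples above.
TASK_TEMPERATURES = {"code": 0.3, "math": 0.5, "creative": 0.9}

def temperature_for(task, default=0.7):
    """Return a sampling temperature suited to the task type."""
    return TASK_TEMPERATURES.get(task, default)
```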

### Multilingual Support

Helion supports 50+ languages, including:

- English, Spanish, French, German, Italian
- Chinese (Simplified & Traditional), Japanese, Korean
- Arabic, Hebrew, Hindi, Russian
- And many more...

## Benchmarks

| Benchmark | Score |
|-----------|-------|
| MMLU | 84.7% |
| GSM8K | 89.2% |
| HumanEval | 75.6% |
| MBPP | 72.3% |
| ARC Challenge | 83.4% |
| HellaSwag | 88.9% |
| TruthfulQA | 61.2% |

## Safety and Limitations

### Safety Features

- Content filtering for harmful outputs
- PII (Personally Identifiable Information) detection
- Prompt injection protection
- Toxicity thresholds

### Known Limitations

- This is a research model; outputs should be independently verified
- May exhibit biases present in the training data
- Performance on highly specialized domains may vary
- Performance degrades on long contexts (>64K tokens)
- Not suitable for production use without further fine-tuning

## Research Use

This model is intended for research and development purposes. It represents an experimental version of the Helion architecture and is continuously being improved.

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{helion-2.5-rnd,
  title={Helion-2.5-Rnd: Advanced Research Language Model},
  author={DeepXR Team},
  year={2025},
  publisher={DeepXR},
  url={https://huggingface.co/DeepXR/Helion-2.5-Rnd}
}
```

## License

This model is released under the Apache License 2.0. See `LICENSE` for full details.

## Support

- **Documentation:** See the `docs/` directory
- **Issues:** Report on GitHub Issues
- **Community:** Join our Discord/Slack
- **Email:** support@deepxr.ai

## Acknowledgments

Built upon the excellent work of:

- Meta AI (LLaMA architecture)
- Hugging Face (Transformers library)
- vLLM team (high-performance inference)
- The open-source AI community

**DeepXR** - Advancing AI Research

*Version: 2.5.0-rnd | Status: Research | Updated: 2025-01-30*