+---
+license: apache-2.0
+datasets:
+- HuggingFaceFW/fineweb-edu
+language:
+- en
+library_name: transformers
+tags:
+- pytorch
+- causal-lm
+- text-generation
+- onner
+---
+# 🚀 RessAI-Ultra 2B
+**RessAI-Ultra 2B** is a custom 2.56-billion-parameter language model built on the `onner` architecture. It is designed for deep reasoning and long-context understanding, with a 128k-token context window and a "Deep & Dense" layer structure.
+<div align="center">
+  <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers_logo_name.png" width="300"/>
+</div>
+## 🔍 Model Details
+- **Model Name:** RessAI-Ultra 2B
+- **Organization:** RessAI
+- **Architecture:** `RessAiForCausalLM` (Custom Llama-based structure)
+- **Model Type:** `onner`
+- **Parameters:** ~2.56 Billion
+- **Context Window:** 131,072 tokens (128k)
+- **Training Precision:** bfloat16
+- **License:** Apache 2.0
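The ~2.56 billion figure can be roughly sanity-checked from the values in the Technical Specifications table. The sketch below assumes a plain Llama-style decoder (bias-free linear layers, two RMSNorm weights per layer, a gated MLP with three projections, and a head dimension of hidden size divided by attention heads). None of these details are confirmed by this card, and the function name `estimate_params` is hypothetical.

```python
def estimate_params(hidden=2560, layers=32, heads=32, kv_heads=4,
                    intermediate=7168, vocab=128_256, tie_embeddings=True):
    """Estimate the parameter count of a Llama-style decoder (assumed layout)."""
    head_dim = hidden // heads                 # 80, assumed: hidden / heads
    kv_dim = kv_heads * head_dim               # 320, width of the K and V projections
    attn = 2 * hidden * hidden + 2 * hidden * kv_dim  # q_proj + o_proj, k_proj + v_proj
    mlp = 3 * hidden * intermediate            # gate, up, and down projections
    norms = 2 * hidden                         # two RMSNorm weight vectors per layer
    per_layer = attn + mlp + norms
    embed = vocab * hidden                     # input embedding matrix
    lm_head = 0 if tie_embeddings else vocab * hidden  # output head, if not shared
    return layers * per_layer + embed + lm_head + hidden  # + final norm


total = estimate_params()
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
```

Under these assumptions the estimate matches ~2.56B only when the input and output embeddings are tied. With untied embeddings it comes out closer to 2.89B. This suggests, but does not confirm, that the embeddings are shared.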
+## 🧠 Technical Specifications
+RessAI-Ultra utilizes a custom configuration designed for efficiency and long-range dependencies:
+| Hyperparameter | Value | Description |
+| :--- | :--- | :--- |
+| **Hidden Size** | 2560 | Custom embedding dimension |
+| **Layers** | 32 | Deep network structure |
+| **Attention Heads** | 32 | Standard query heads |
+| **KV Heads** | 4 | Grouped-Query Attention (GQA), 8:1 query-to-KV head ratio |
+| **Intermediate Size** | 7168 | Wide MLP for high capacity |
+| **RoPE Theta** | 2,000,000 | Enhanced for long context stability |
+| **Vocab Size** | 128,256 | Llama-3 Tokenizer compatibility |
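To illustrate why the 8:1 GQA ratio matters at a 128k context, the sketch below estimates the key/value cache size for a single sequence. It assumes a head dimension of 80 (hidden size 2560 divided by 32 attention heads) and a bfloat16 cache at 2 bytes per element. Neither is stated explicitly in this card, and `kv_cache_bytes` is a hypothetical helper, not part of the model's code.

```python
def kv_cache_bytes(seq_len, layers=32, kv_heads=4, head_dim=80, bytes_per_elem=2):
    """Approximate KV-cache size in bytes for one sequence."""
    # The leading 2 accounts for storing both keys and values in every layer.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem


context = 131_072                                  # full context window from the card
gqa = kv_cache_bytes(context)                      # 4 KV heads, as in the table
mha = kv_cache_bytes(context, kv_heads=32)         # hypothetical: one KV head per query head

print(f"GQA (4 KV heads):  {gqa / 2**30:.1f} GiB")
print(f"MHA (32 KV heads): {mha / 2**30:.1f} GiB")
```

Under these assumptions, a full 131,072-token context needs about 5 GiB of cache with 4 KV heads, compared with about 40 GiB if every query head had its own KV head.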
+## 💻 Usage
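+Although this model uses a custom architecture type (`onner`) and configuration (`RessAiConfig`), you can load it with the standard `transformers` library. Pass `trust_remote_code=True` so that any custom model code shipped with the repository can be loaded.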
+### Python Code
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+import torch
+model_id = "RessAI/RessAI-Ultra"
+# Load Tokenizer
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+# Load Model
+# Note: Ensure you have the latest transformers version
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+    trust_remote_code=True # Required for custom config/arch if code is present
+)
+# Inference
+prompt = "The future of artificial intelligence is"
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+outputs = model.generate(
+    **inputs,
+    max_new_tokens=100,
+    temperature=0.7,
+    top_p=0.9,
+    do_sample=True
+)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))