RessAI committed on
Commit b37d57d · verified · 1 Parent(s): 6b70a03

Update README.md

Files changed (1)
  1. README.md +27 -25
README.md CHANGED
@@ -11,70 +11,72 @@ tags:
  - text-generation
  - onner
  ---
- # 🚀 RessAI-Ultra 2B
+ # 🚀 RessAI Onner-300m

- **RessAI-Ultra 2B** is a custom 2.56 billion-parameter language model built on the highly optimized `onner` architecture. Designed for deep reasoning and long-context understanding, it features a 128k context window and a "Deep & Dense" layer structure.
+ **Onner-300m** (internally `RessAI-Ultra-300M`) is a compact, high-efficiency language model designed for educational reasoning and lightweight deployment. With approximately **200 million parameters**, it follows a "Dense & Deep" philosophy scaled down for speed and accessibility.
+
+ It is trained on the high-quality [FineWeb-Edu](https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu) dataset, utilizing a custom architecture (`RessAiForCausalLM`) optimized for efficient inference.

  <div align="center">
- <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers_logo_name.png" width="300"/>
+ <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers_logo_name.png" width="200"/>
  </div>

  ## 🔍 Model Details

- - **Model Name:** RessAI-Ultra 2B
+ - **Model Name:** RessAI Onner-300m
  - **Organization:** RessAI
- - **Architecture:** `RessAiForCausalLM` (Custom Llama-based structure)
+ - **Architecture:** `RessAiForCausalLM` (Custom Llama-style structure)
  - **Model Type:** `onner`
- - **Parameters:** ~2.56 Billion
- - **Context Window:** 131,072 tokens (128k)
+ - **Parameters:** ~199.9 Million (0.20B)
+ - **Context Window:** 4,096 tokens
+ - **Vocabulary:** 128,256 (Llama-3 compatible)
  - **Training Precision:** Bfloat16
  - **License:** Apache 2.0

  ## 🧠 Technical Specifications

- RessAI-Ultra utilizes a custom configuration designed for efficiency and long-range dependencies:
+ This model uses a custom configuration inspired by BERT-base sizing but with Llama's causal attention mechanisms:

  | Hyperparameter | Value | Description |
  | :--- | :--- | :--- |
- | **Hidden Size** | 2560 | Custom embedding dimension |
- | **Layers** | 32 | Deep network structure |
- | **Attention Heads** | 32 | Standard query heads |
- | **KV Heads** | 4 | Grouped Query Attention (GQA, 8:1 ratio) |
- | **Intermediate Size** | 7168 | Wide MLP for high capacity |
- | **RoPE Theta** | 2,000,000 | Enhanced for long-context stability |
- | **Vocab Size** | 128,256 | Llama-3 tokenizer compatibility |
+ | **Hidden Size** | 768 | Embedding dimension (compact) |
+ | **Layers** | 12 | Network depth |
+ | **Attention Heads** | 12 | Query heads |
+ | **KV Heads** | 2 | Grouped Query Attention (GQA, 6:1 ratio) |
+ | **Intermediate Size** | 3,072 | MLP width |
+ | **RoPE Theta** | 500,000 | Rotary embeddings base |
+ | **Max Sequence** | 4,096 | Context length |

  ## 💻 Usage

- Because this model uses a custom architecture type (`onner`) and configuration (`RessAiConfig`), you can load it using the standard `transformers` library.
-
- ### Python Code
+ ### Python Code (Transformers)
+
+ Since this model uses a custom architecture configuration (`onner`), make sure you have a recent version of `transformers` installed.

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer
  import torch

- model_id = "RessAI/RessAI-Ultra"
+ model_id = "RessAI/Onner-300m"

- # Load Tokenizer
+ # 1. Load Tokenizer
  tokenizer = AutoTokenizer.from_pretrained(model_id)

- # Load Model
- # Note: Ensure you have the latest transformers version
+ # 2. Load Model
  model = AutoModelForCausalLM.from_pretrained(
      model_id,
-     torch_dtype=torch.bfloat16,
+     torch_dtype=torch.bfloat16,  # Use float16 if bfloat16 is not supported
      device_map="auto",
-     trust_remote_code=True  # Required for the custom config/architecture
+     trust_remote_code=True
  )

- # Inference
+ # 3. Inference
  prompt = "The future of artificial intelligence is"
  inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

  outputs = model.generate(
      **inputs,
-     max_new_tokens=100,
+     max_new_tokens=50,
      temperature=0.7,
      top_p=0.9,
      do_sample=True
  )
  ```
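The extracted diff is cut off mid-`generate` call, but the parameter counts claimed in both versions of the card can still be checked against the listed hyperparameters. Below is a minimal sketch, assuming a standard Llama-style decoder layout (GQA attention, SwiGLU MLP with gate/up/down projections, RMSNorm) with tied input/output embeddings; the tied-embedding part is an assumption the cards do not state:

```python
def llama_param_count(vocab, hidden, layers, heads, kv_heads, intermediate):
    """Approximate parameter count of a Llama-style decoder-only model,
    assuming tied embeddings, GQA, SwiGLU MLP, and RMSNorm."""
    head_dim = hidden // heads
    attn = 2 * hidden * hidden + 2 * hidden * kv_heads * head_dim  # q/o proj + k/v proj (GQA)
    mlp = 3 * hidden * intermediate                                # gate, up, down projections
    norms = 2 * hidden                                             # two RMSNorm vectors per layer
    # token embeddings + decoder blocks + final norm
    return vocab * hidden + layers * (attn + mlp + norms) + hidden

# Old card: RessAI-Ultra 2B (hidden 2560, 32 layers, 32 Q / 4 KV heads, MLP 7168)
ultra = llama_param_count(128_256, 2560, 32, 32, 4, 7168)
# New card: Onner-300m (hidden 768, 12 layers, 12 Q / 2 KV heads, MLP 3072)
onner = llama_param_count(128_256, 768, 12, 12, 2, 3072)

print(f"Ultra: {ultra / 1e9:.2f}B")  # ≈ 2.56B, matching the old "~2.56 Billion" claim
print(f"Onner: {onner / 1e6:.1f}M")  # ≈ 200.0M, matching the new "~199.9 Million" claim
```

Both headline figures check out under these assumptions; notably, the ~199.9M figure only works with tied embeddings — untied embeddings would put the count near 298M, closer to the "300m" in the model's name.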