---
license: apache-2.0
datasets:
- HuggingFaceFW/fineweb-edu
language:
- en
library_name: transformers
tags:
- pytorch
- causal-lm
- text-generation
- onner
---

# 🚀 RessAI Onner-300m

**Onner-300m** (internally `RessAI-Ultra-300M`) is a compact, high-efficiency language model designed for educational reasoning and lightweight deployment. With approximately **200 million parameters**, it follows a "Dense & Deep" philosophy scaled down for speed and accessibility.

It is trained on the high-quality [FineWeb-Edu](https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu) dataset, utilizing a custom architecture (`RessAiForCausalLM`) optimized for efficient inference.
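A minimal usage sketch with 🤗 Transformers. The repository id below is a hypothetical placeholder, and `trust_remote_code=True` is assumed to be required because `RessAiForCausalLM` is a custom architecture not built into the library:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id -- replace with the actual Hub path.
model_id = "RessAI/Onner-300m"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the bfloat16 training precision
    trust_remote_code=True,      # needed for the custom RessAiForCausalLM class
)

inputs = tokenizer("The water cycle begins when", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```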
## 🔍 Model Details

- **Model Name:** RessAI Onner-300m
- **Organization:** RessAI
- **Architecture:** `RessAiForCausalLM`
- **Model Type:** `onner`
- **Parameters:** ~199.9 million (0.20B)
- **Context Window:** 4,096 tokens
- **Vocabulary:** 128,256 tokens
- **Training Precision:** bfloat16
- **License:** Apache 2.0

## 🧠 Technical Specifications

This model uses a custom configuration inspired by BERT-base sizing but with Llama-style causal attention mechanisms:

| Hyperparameter | Value | Description |
| :--- | :--- | :--- |
| **Hidden Size** | 768 | Embedding dimension (compact) |
| **Layers** | 12 | Network depth |
| **Attention Heads** | 12 | Query heads |
| **KV Heads** | 2 | Grouped-Query Attention (6:1 GQA ratio) |
| **Intermediate Size** | 3,072 | MLP width |
| **RoPE Theta** | 500,000 | Rotary embedding base |
| **Max Sequence** | 4,096 | Context length |
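The hyperparameters above can be sanity-checked against the stated parameter count. The sketch below assumes a Llama-style decoder block (GQA attention, SwiGLU MLP, RMSNorm) with tied input/output embeddings; the actual `RessAiForCausalLM` layout may differ in detail:

```python
# Rough parameter count from the table above, assuming a Llama-style
# decoder with tied embeddings (an assumption, not confirmed by the card).
vocab_size = 128_256
hidden = 768
layers = 12
heads = 12
kv_heads = 2
head_dim = hidden // heads  # 64
intermediate = 3_072

embedding = vocab_size * hidden  # tied with the LM head, so counted once

attn = (
    hidden * heads * head_dim           # q_proj
    + 2 * hidden * kv_heads * head_dim  # k_proj + v_proj (only 2 KV heads)
    + heads * head_dim * hidden         # o_proj
)
mlp = 3 * hidden * intermediate  # gate_proj, up_proj, down_proj (SwiGLU)
norms = 2 * hidden               # two RMSNorm weights per layer

total = embedding + layers * (attn + mlp + norms) + hidden  # + final RMSNorm
print(f"{total / 1e6:.2f}M parameters")  # ~0.20B, consistent with the card

# GQA also shrinks the KV cache: bytes per cached token in bf16 (K and V)
kv_gqa = 2 * layers * kv_heads * head_dim * 2
kv_mha = 2 * layers * heads * head_dim * 2
print(kv_gqa, kv_mha)  # the 6:1 head ratio cuts KV-cache memory 6x
```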