metadata
license: apache-2.0
datasets:
  - HuggingFaceFW/fineweb-edu
language:
  - en
library_name: transformers
tags:
  - pytorch
  - causal-lm
  - text-generation
  - onner

🚀 RessAI Onner-300m

Onner-300m (internally RessAI-Ultra-300M) is a compact language model designed for educational reasoning and lightweight deployment. With approximately 200 million parameters, it applies a "Dense & Deep" design philosophy at a scale small enough for fast, accessible inference.

It is trained on the FineWeb-Edu dataset (HuggingFaceFW/fineweb-edu) and uses a custom architecture, RessAiForCausalLM, optimized for efficient inference.
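
Because the architecture is custom, loading it through `transformers` will most likely require remote code to be enabled. The following is a minimal sketch, not an official usage recipe: the repository id `RessAI/Onner-300m` and the need for `trust_remote_code=True` are assumptions that this card does not confirm, and `generate` is a hypothetical helper name.

```python
# Assumed Hub path; the model card does not state the exact repository id.
REPO_ID = "RessAI/Onner-300m"


def generate(prompt: str, repo_id: str = REPO_ID, max_new_tokens: int = 64) -> str:
    """Load the model and return a completion for `prompt`."""
    # Imported lazily so that defining this helper does not require the libraries.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # trust_remote_code=True is an assumption: RessAiForCausalLM is a custom
    # class, so its code presumably ships with the repository.
    tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.bfloat16,  # matches the stated training precision
        trust_remote_code=True,
    )
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Photosynthesis is")` would download the weights on first use. Review any remote code before enabling it.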

🔍 Model Details

  • Model Name: RessAI Onner-300m
  • Organization: RessAI
  • Architecture: RessAiForCausalLM
  • Model Type: onner
  • Parameters: ~199.9 Million (0.20B)
  • Context Window: 4,096 tokens
  • Vocabulary: 128,256
  • Training Precision: bfloat16
  • License: Apache 2.0
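
As a rough guide to deployment size, the parameter count and precision listed above imply the following weight footprint. This is a back-of-the-envelope sketch: it covers weights only (no activations, KV cache, or framework overhead), and `weight_memory_mib` is a hypothetical helper name.

```python
# Approximate parameter count taken from the Model Details list.
PARAMETERS = 199_900_000

# Storage cost per parameter for two common dtypes.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weight_memory_mib(num_params: int, dtype: str = "bfloat16") -> float:
    """Return the memory needed to hold the weights, in MiB."""
    return num_params * BYTES_PER_PARAM[dtype] / (1024 ** 2)


if __name__ == "__main__":
    for dtype in ("bfloat16", "float32"):
        print(f"{dtype}: {weight_memory_mib(PARAMETERS, dtype):.1f} MiB")
```

In bfloat16 this comes to roughly 381 MiB, and about twice that in float32.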

🧠 Technical Specifications

This model uses a custom configuration that combines BERT-base sizing (hidden size 768, 12 layers, 12 attention heads) with Llama-style causal attention:

| Hyperparameter | Value | Description |
|---|---|---|
| Hidden Size | 768 | Embedding dimension (compact) |
| Layers | 12 | Network depth |
| Attention Heads | 12 | Query heads |
| KV Heads | 2 | Grouped Query Attention (GQA 6:1) |
| Intermediate Size | 3,072 | MLP width |
| RoPE Theta | 500,000 | Rotary embeddings base |
| Max Sequence | 4,096 | Context length |
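
The hyperparameters above can be checked against the stated parameter count, and they show what the 6:1 GQA ratio saves in KV cache memory. The sketch below assumes a Llama-style layer layout (bias-free projections, a gated three-matrix MLP, two RMSNorm weights per layer plus a final norm, and tied input/output embeddings). None of these details are confirmed by this card, and the function names are hypothetical.

```python
# Hyperparameters copied from the specification table.
VOCAB_SIZE = 128_256
HIDDEN_SIZE = 768
NUM_LAYERS = 12
NUM_HEADS = 12
NUM_KV_HEADS = 2
INTERMEDIATE_SIZE = 3_072
MAX_SEQ_LEN = 4_096

HEAD_DIM = HIDDEN_SIZE // NUM_HEADS  # 64 dimensions per head


def estimate_parameters(tie_embeddings: bool = True) -> int:
    """Count weights for an assumed Llama-style decoder (no biases)."""
    embedding = VOCAB_SIZE * HIDDEN_SIZE
    kv_width = NUM_KV_HEADS * HEAD_DIM
    # Q and O projections are hidden x hidden; K and V are hidden x kv_width.
    attention = 2 * HIDDEN_SIZE * HIDDEN_SIZE + 2 * HIDDEN_SIZE * kv_width
    # Gated MLP (assumed): gate, up, and down projections.
    mlp = 3 * HIDDEN_SIZE * INTERMEDIATE_SIZE
    # Two norm weight vectors per layer (assumed RMSNorm, no bias).
    norms = 2 * HIDDEN_SIZE
    total = embedding + NUM_LAYERS * (attention + mlp + norms) + HIDDEN_SIZE
    if not tie_embeddings:
        total += VOCAB_SIZE * HIDDEN_SIZE  # separate output projection
    return total


def kv_cache_bytes(seq_len: int, num_kv_heads: int = NUM_KV_HEADS,
                   bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for one sequence (bfloat16 default)."""
    return 2 * NUM_LAYERS * num_kv_heads * HEAD_DIM * bytes_per_value * seq_len


if __name__ == "__main__":
    print(f"tied embeddings:   {estimate_parameters():,}")
    print(f"untied embeddings: {estimate_parameters(False):,}")
    gqa = kv_cache_bytes(MAX_SEQ_LEN)
    mha = kv_cache_bytes(MAX_SEQ_LEN, num_kv_heads=NUM_HEADS)
    print(f"KV cache at full context: {gqa / 2**20:.0f} MiB (GQA) "
          f"vs {mha / 2**20:.0f} MiB (12 KV heads)")
```

Under these assumptions the tied-embedding count is 199,969,536, which is consistent with the ~199.9 million figure listed under Model Details. The untied count would be 298,470,144. At the full 4,096-token context, the bfloat16 KV cache is 24 MiB with 2 KV heads, compared with 144 MiB if all 12 heads kept their own keys and values.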