license: apache-2.0
datasets:
  - HuggingFaceFW/fineweb-edu
language:
  - en
library_name: transformers
tags:
  - pytorch
  - causal-lm
  - text-generation
  - onner

🚀 RessAI Onner-300m

Onner-300m (internally RessAI-Ultra-300M) is a compact language model designed for educational reasoning and lightweight deployment. With approximately 200 million parameters, it follows a "Dense & Deep" philosophy scaled down for speed and accessibility.

It is trained on the FineWeb-Edu dataset and uses a custom architecture (RessAiForCausalLM) optimized for efficient inference.

🔍 Model Details

Model Name: RessAI Onner-300m
Organization: RessAI
Architecture: RessAiForCausalLM
Model Type: onner
Parameters: ~199.9 million (0.20B)
Context Window: 4,096 tokens
Vocabulary: 128,256
Training Precision: bfloat16
License: Apache 2.0
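
The details above suggest a loading pattern along the following lines. This is a minimal sketch, not an official usage snippet: the repo id `RessAI/Onner-300m` is inferred from the organization and model name, and `trust_remote_code=True` is an assumption based on the architecture being custom (RessAiForCausalLM). The helper names `load_onner` and `generate` are hypothetical.

```python
def load_onner(repo_id="RessAI/Onner-300m", device="cpu"):
    """Load tokenizer and model in bfloat16.

    Assumptions (not confirmed by the model card): the repo id above, and
    that the custom architecture ships as remote code on the Hub.
    """
    # Imported lazily so this module can be imported without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.bfloat16,  # matches the stated training precision
        trust_remote_code=True,      # assumption: custom modeling code is required
    )
    return tokenizer, model.to(device).eval()


def generate(prompt, tokenizer, model, max_new_tokens=64):
    """Generate a continuation for a single prompt."""
    import torch

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Typical use would be `tokenizer, model = load_onner()` followed by `generate("Photosynthesis is", tokenizer, model)`. Prompt plus generated tokens should stay within the 4,096-token context window.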

🧠 Technical Specifications

This model uses a custom configuration that combines BERT-base sizing (hidden size 768, 12 layers, 12 attention heads) with Llama-style causal attention:

Hyperparameter	Value	Description
Hidden Size	768	Embedding dimension (Compact)
Layers	12	Network depth
Attention Heads	12	Query heads
KV Heads	2	Grouped Query Attention (GQA 6:1)
Intermediate Size	3,072	MLP Width
RoPE Theta	500,000	Rotary Embeddings Base
Max Sequence	4,096	Context Length
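
The hyperparameters above are enough to sanity-check the stated parameter count and to estimate the memory effect of GQA. The sketch below assumes a Llama-style block that the model card does not spell out: no bias terms, a three-projection (gate, up, down) MLP, two RMSNorm weight vectors per layer plus one final norm, and input/output embeddings that share weights. `OnnerConfig`, `count_parameters`, and `kv_cache_bytes` are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OnnerConfig:
    """Values copied from the specification table."""
    vocab_size: int = 128_256
    hidden_size: int = 768
    num_layers: int = 12
    num_heads: int = 12
    num_kv_heads: int = 2
    intermediate_size: int = 3_072
    max_seq_len: int = 4_096

    @property
    def head_dim(self) -> int:
        return self.hidden_size // self.num_heads  # 768 / 12 = 64


def count_parameters(cfg: OnnerConfig, tie_embeddings: bool = True) -> int:
    """Parameter count under the assumed Llama-style layout (no biases)."""
    kv_dim = cfg.num_kv_heads * cfg.head_dim
    # Q and O projections are hidden x hidden; K and V are hidden x kv_dim.
    attention = 2 * cfg.hidden_size * cfg.hidden_size + 2 * cfg.hidden_size * kv_dim
    # Assumed gated MLP: gate, up, and down projections.
    mlp = 3 * cfg.hidden_size * cfg.intermediate_size
    norms = 2 * cfg.hidden_size  # one weight vector per norm, two norms per layer
    per_layer = attention + mlp + norms

    embeddings = cfg.vocab_size * cfg.hidden_size
    lm_head = 0 if tie_embeddings else embeddings
    final_norm = cfg.hidden_size
    return embeddings + cfg.num_layers * per_layer + final_norm + lm_head


def kv_cache_bytes(cfg: OnnerConfig, seq_len: int,
                   bytes_per_value: int = 2, kv_heads=None) -> int:
    """Size of the K and V caches for one sequence (bfloat16 = 2 bytes)."""
    if kv_heads is None:
        kv_heads = cfg.num_kv_heads
    return 2 * cfg.num_layers * kv_heads * cfg.head_dim * seq_len * bytes_per_value


cfg = OnnerConfig()
print(f"tied embeddings:   {count_parameters(cfg):,}")
print(f"untied embeddings: {count_parameters(cfg, tie_embeddings=False):,}")
print(f"KV cache, GQA (2 KV heads):  {kv_cache_bytes(cfg, cfg.max_seq_len) / 2**20:.0f} MiB")
print(f"KV cache, MHA (12 KV heads): "
      f"{kv_cache_bytes(cfg, cfg.max_seq_len, kv_heads=cfg.num_heads) / 2**20:.0f} MiB")
```

Under these assumptions the tied-embedding total is 199,969,536, in line with the ~199.9 million figure above, while untied embeddings would give 298,470,144. The model card does not state which layout is used, so treat the match as a consistency check rather than confirmation. At the full 4,096-token context in bfloat16, the 6:1 GQA ratio brings the per-sequence KV cache from 144 MiB (if all 12 heads had their own K/V) down to 24 MiB.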