Nexus1-124M-v1

Nexus1 is the first prototype in the Nexus LLM series. It is a 124M-parameter, decoder-only transformer pre-trained from scratch on high-quality educational web data.

Model Details

  • Architecture: GPT-2 style
  • Parameters: 124 Million
  • Context Window: 1024 tokens
  • Vocabulary Size: 50,257 (Custom BPE)
  • Training Stage: Base Pre-training (Complete)
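
As a sanity check on the figures above, the sketch below counts the parameters of a GPT-2 style model. The vocabulary size and context window come from this card; the layer count (12) and hidden width (768) are assumptions taken from the standard GPT-2 small configuration and are not stated here, as is the assumption that the output head shares weights with the token embedding.

```python
def gpt2_param_count(n_layer=12, d_model=768, vocab_size=50257, n_ctx=1024):
    """Count parameters of a GPT-2 style decoder with a tied output head.

    n_layer and d_model are assumed (GPT-2 small defaults), not taken
    from the model card.
    """
    # Token embedding plus learned position embedding.
    embeddings = vocab_size * d_model + n_ctx * d_model
    # Attention: fused QKV projection and output projection, each with bias.
    attn = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # MLP: expand to 4 * d_model and project back, each with bias.
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    # Two layer norms per block, each with a weight and a bias vector.
    layer_norms = 2 * (2 * d_model)
    per_block = attn + mlp + layer_norms
    # Final layer norm after the last block.
    final_ln = 2 * d_model
    return embeddings + n_layer * per_block + final_ln


print(f"{gpt2_param_count():,}")  # 124,439,808
```

Under those assumptions the total rounds to the 124M quoted above, with roughly 38.6M of it sitting in the token embedding.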

Training Data

Pre-trained on the FineWeb-Edu (10B Sample) dataset, which is filtered for high-quality educational content from the web. The data was chosen with the aim of improving reasoning capabilities in a small model.
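
For readers who want to inspect the training data, here is a minimal sketch that streams a few documents with the `datasets` library. The repository id `HuggingFaceFW/fineweb-edu` and the config name `sample-10BT` are assumptions about which Hub dataset "FineWeb-Edu (10B Sample)" refers to; the helper names are illustrative only.

```python
from itertools import islice


def take_texts(rows, n):
    """Return the "text" field of the first n rows of any iterable of dicts."""
    return [row["text"] for row in islice(rows, n)]


def stream_fineweb_edu_sample():
    """Open the dataset in streaming mode so nothing is downloaded up front.

    The repo id and config name are assumptions, not taken from the model card.
    """
    # Imported lazily so take_texts() works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(
        "HuggingFaceFW/fineweb-edu",
        name="sample-10BT",
        split="train",
        streaming=True,
    )
```

Calling `take_texts(stream_fineweb_edu_sample(), 3)` would then fetch the first three documents over the network.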

Purpose

Nexus1 serves as a proof of concept for the Nexus Training Pipeline. It is intended as a reference model for knowledge distillation into the larger Nexus2-1B model.

How to use

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the pre-trained weights and the matching tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained("mnnobi/Nexus1-124M-v1")
tokenizer = AutoTokenizer.from_pretrained("mnnobi/Nexus1-124M-v1")
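
Loading the model is only the first step. The sketch below shows one way to generate text while keeping the prompt plus the continuation inside the 1024-token context window. The sampling settings (`top_p`, `temperature`) and the helper names are illustrative assumptions, not recommendations from the authors, and the code assumes the tokenizer defines an end-of-sequence token, as GPT-2 style tokenizers usually do.

```python
def clamp_new_tokens(prompt_len, max_new_tokens, context_window=1024):
    """Limit generation so prompt + new tokens fit in the context window."""
    return max(0, min(max_new_tokens, context_window - prompt_len))


def generate_text(prompt, max_new_tokens=50):
    """Sample a continuation of `prompt` from Nexus1 (hypothetical helper)."""
    # Imported lazily so clamp_new_tokens() works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = "mnnobi/Nexus1-124M-v1"
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[1]
    budget = clamp_new_tokens(prompt_len, max_new_tokens)
    if budget == 0:
        raise ValueError("prompt already fills the context window")

    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=budget,
            do_sample=True,    # assumed sampling settings, tune as needed
            top_p=0.95,
            temperature=0.8,
            pad_token_id=tokenizer.eos_token_id,  # avoids a missing-pad warning
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Because this is a base pre-trained model, it continues text rather than following instructions, so a prompt such as `generate_text("Photosynthesis is the process by which")` is a better fit than a question.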