---
language: en
license: apache-2.0
tags:
- causal-lm
- pretraining
- research
- from-scratch
---

# HelionX Base 300M

HelionX Base 300M is a **from-scratch pretrained causal language model** developed as part of the HelionX research initiative.

## Model Details

- **Architecture:** Decoder-only Transformer
- **Parameters:** ~300M
- **Layers:** 22
- **Hidden size:** 896
- **Attention heads:** 14
- **Context length:** 2048 tokens
- **Tokenizer:** GPT-2 BPE (50,257-token vocabulary)
- **Precision:** FP16 training
- **Training tokens:** 300M
- **Training data:** OpenWebText (streamed)

## Training

The model was trained incrementally, resuming from intermediate checkpoints, to complete a full **300M-token pretraining run** using mixed-precision training and gradient checkpointing.

Training infrastructure:

- Modal (A100 40GB)
- PyTorch
- Hugging Face tooling

## Intended Use

- Research
- Continued pretraining
- Fine-tuning
- Architecture experiments

## Limitations

This is a base model and is **not instruction-tuned**. Outputs may be incoherent or unsafe without further alignment.

## License

Apache 2.0
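As a sanity check on the hyperparameters listed under Model Details, a short script can estimate the parameter count. This is a sketch using the standard GPT-2-style breakdown; the 4x MLP expansion, learned positional embeddings, and untied output projection are assumptions, not details stated in the card:

```python
# Hyperparameters from the model card.
vocab_size = 50257   # GPT-2 BPE vocabulary
n_positions = 2048   # context length
n_layer = 22
d_model = 896
n_head = 14

head_dim = d_model // n_head  # 896 / 14 = 64, a common per-head dimension

# Per-layer weights (GPT-2-style block):
# attention = QKV projections + output projection (~4*d^2) plus biases,
# MLP = up/down projections with an assumed 4x expansion (~8*d^2) plus biases,
# two LayerNorms with weight and bias each.
attn_params = 4 * d_model * d_model + 4 * d_model
mlp_params = 2 * d_model * (4 * d_model) + 4 * d_model + d_model
ln_params = 2 * 2 * d_model
per_layer = attn_params + mlp_params + ln_params

# Token + learned positional embeddings, an assumed untied LM head,
# and a final LayerNorm.
embedding_params = vocab_size * d_model + n_positions * d_model
lm_head_params = vocab_size * d_model
total = n_layer * per_layer + embedding_params + lm_head_params + 2 * d_model

print(f"head_dim = {head_dim}")
print(f"~{total / 1e6:.0f}M parameters")  # ≈ 304M under these assumptions
```

Under these assumptions the estimate lands near the advertised ~300M; with tied input/output embeddings it would drop to roughly 259M, which suggests the head is untied.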