---
license: apache-2.0
datasets:
- HuggingFaceFW/fineweb-edu
- openbmb/Ultra-FineWeb
- HuggingFaceTB/finemath
- HuggingFaceTB/smollm-corpus
- openbmb/UltraData-Math
language:
- en
library_name: transformers
tags:
- causal-lm
- decoder-only
- grouped-query-attention
- rope
- swiglu
- custom-tokenizer
- curriculum-learning
- xsa
pipeline_tag: text-generation
---

# Atom 3.4m
Atom is a 3.4M-parameter causal language model developed by **Universal Computing Research**. It was pretrained from scratch as a compact research model for studying language-model architecture, data curricula, and small-model benchmarking.
## Model details
- Architecture: causal decoder-only language model
- Parameters: 3,412,800
- Layers: 7
- Hidden size: 192
- Attention: 3 query heads and 1 key-value head (grouped-query attention)
- Head dimension: 64
- Feed-forward size: 480
- Context length: 512 tokens
- Positional encoding: rotary position embeddings (RoPE)
- RoPE theta: 5000.0
- Normalization: RMSNorm
- Activation: gated SiLU feed-forward network
- Vocabulary size: 4,096 tokens
- Tokenizer: custom byte-level BPE, exposed as `GPT2TokenizerFast`
- Training tokens: approximately 5 billion
- License: Apache-2.0
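
The RoPE settings in the list above (head dimension 64, theta 5000.0) determine the per-pair rotation frequencies. The following is a minimal sketch of the standard RoPE formulation in plain Python, not Atom's actual implementation. The names `rope_inverse_frequencies` and `rotate_pair` are illustrative, and the way the custom model code pairs dimensions may differ.

```python
import math

HEAD_DIM = 64        # head dimension from the model details
ROPE_THETA = 5000.0  # RoPE theta from the model details


def rope_inverse_frequencies(head_dim: int = HEAD_DIM, theta: float = ROPE_THETA) -> list[float]:
    """Return one inverse frequency per rotated pair: theta ** (-2i / head_dim)."""
    return [theta ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]


def rotate_pair(x: float, y: float, position: int, inv_freq: float) -> tuple[float, float]:
    """Rotate one (x, y) pair by the angle position * inv_freq."""
    angle = position * inv_freq
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return x * cos_a - y * sin_a, x * sin_a + y * cos_a


inv_freqs = rope_inverse_frequencies()
print(len(inv_freqs))               # 32 rotated pairs per head
print(inv_freqs[0], inv_freqs[-1])  # fastest and slowest frequencies
```

Because each pair is only rotated, the norm of a query or key vector is unchanged; only the relative angle between positions carries positional information.
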
The model uses tied input and output embeddings. Its custom attention implementation combines grouped-query attention with XSA.
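
The listed dimensions can be cross-checked against the stated parameter count. The sketch below assumes bias-free linear projections, two RMSNorm weight vectors per layer, and one final RMSNorm. These are assumptions rather than details confirmed by this card, but under them the total comes to exactly 3,412,800.

```python
VOCAB, HIDDEN, LAYERS = 4096, 192, 7
Q_HEADS, KV_HEADS, HEAD_DIM, FFN = 3, 1, 64, 480

embedding = VOCAB * HIDDEN  # counted once: input and output embeddings are tied

# Attention projections (assumed bias-free): Q and O are full width,
# while K and V only cover the single shared key-value head.
q_proj = HIDDEN * Q_HEADS * HEAD_DIM
kv_proj = 2 * HIDDEN * KV_HEADS * HEAD_DIM
o_proj = Q_HEADS * HEAD_DIM * HIDDEN
attention = q_proj + kv_proj + o_proj

# Gated SiLU feed-forward (assumed bias-free): gate, up, and down matrices.
feed_forward = 3 * HIDDEN * FFN

# Assumed: one RMSNorm weight vector before attention and one before the feed-forward block.
norms = 2 * HIDDEN

per_layer = attention + feed_forward + norms
total = embedding + LAYERS * per_layer + HIDDEN  # + final RMSNorm (assumed)
print(f"{total:,}")  # 3,412,800
```

The tied embedding table alone accounts for 786,432 parameters, roughly 23% of the model, which is why the small 4,096-token vocabulary matters at this scale.
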
## Tokenizer
Atom uses a custom byte-level BPE tokenizer trained specifically for this pretraining corpus. The tokenizer has a vocabulary of 4,096 tokens and includes dedicated padding, beginning-of-sequence, end-of-sequence, unknown, and end-of-text tokens.
## Training data and curriculum
Atom was trained on a curriculum combining general web text, educational material, synthetic textbook-style content, and mathematical data. The mixture changed gradually during training: general web data was emphasized earlier, while educational, synthetic, and mathematical material received more weight later.
Approximate proportions over the complete training run were:
| Dataset | Subset / split used | Approximate proportion |
|---|---|---:|
| [HuggingFaceFW/fineweb-edu](https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu) | All available `CC-MAIN-*` configurations under `data/`, `train` split | 39% |
| [openbmb/Ultra-FineWeb](https://huggingface.co/datasets/openbmb/Ultra-FineWeb) | English v1.4 (`ultrafineweb_en_v1_4`; `en` split) | 31% |
| [HuggingFaceTB/finemath](https://huggingface.co/datasets/HuggingFaceTB/finemath) | `finemath-3plus`, `train` split | 12% |
| [HuggingFaceTB/smollm-corpus](https://huggingface.co/datasets/HuggingFaceTB/smollm-corpus) | `cosmopedia-v2`, `train` split | 12% |
| [openbmb/UltraData-Math](https://huggingface.co/datasets/openbmb/UltraData-Math) | `UltraData-Math-L2-preview`, `train` split | 6% |
These percentages describe the approximate aggregate sampling mixture rather than exact document counts. Refer to the individual dataset cards for their source information, licenses, and usage conditions.
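
To make the aggregate mixture concrete, the sketch below converts the table's percentages into rough per-dataset token budgets. It assumes exactly 5 billion training tokens (the card only says "approximately") and treats the percentages as exact. It also shows how the same weights could drive a weighted sampler. The real run varied the mixture over time, and that schedule is not specified here, so `sample_sources` is a hypothetical helper that only reflects the aggregate.

```python
import random

TOTAL_TOKENS = 5_000_000_000  # assumed exact; the card says "approximately 5 billion"

MIXTURE_PERCENT = {
    "HuggingFaceFW/fineweb-edu": 39,
    "openbmb/Ultra-FineWeb": 31,
    "HuggingFaceTB/finemath": 12,
    "HuggingFaceTB/smollm-corpus": 12,
    "openbmb/UltraData-Math": 6,
}
assert sum(MIXTURE_PERCENT.values()) == 100

# Approximate number of training tokens drawn from each dataset.
token_budget = {name: TOTAL_TOKENS * pct // 100 for name, pct in MIXTURE_PERCENT.items()}
for name, tokens in token_budget.items():
    print(f"{name:30s} ~{tokens / 1e9:.2f}B tokens")


def sample_sources(n: int, seed: int = 0) -> list[str]:
    """Draw n dataset names with probability proportional to the aggregate mixture."""
    rng = random.Random(seed)
    names = list(MIXTURE_PERCENT)
    weights = list(MIXTURE_PERCENT.values())
    return rng.choices(names, weights=weights, k=n)
```
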
## Intended use
This is a small base language model intended for research and benchmarking. It may be useful for experiments involving compact architectures, pretraining curricula, tokenization, evaluation pipelines, and resource-constrained inference.
Atom is a base model and has not been instruction-tuned or aligned for assistant-style interaction.
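
For experiments of this kind, a minimal loading sketch with Transformers might look as follows. It assumes the repository's custom model code requires `trust_remote_code=True` and supports the standard `generate` method. The helper names `new_token_budget` and `complete`, the sampling temperature, and the default of 64 new tokens are illustrative choices, not recommendations from the authors.

```python
MODEL_ID = "UniversalComputingResearch/Atom3.4m"
CONTEXT_LENGTH = 512  # context length from the model details


def new_token_budget(prompt_tokens: int, requested: int) -> int:
    """Cap the number of new tokens so prompt plus continuation fits the context."""
    return max(0, min(requested, CONTEXT_LENGTH - prompt_tokens))


def complete(prompt: str, max_new_tokens: int = 64) -> str:
    """Sample a continuation of `prompt` from the base model."""
    # Imported here so the helper above can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repository ships custom model code, so remote code must be trusted.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)

    inputs = tokenizer(prompt, return_tensors="pt")
    budget = new_token_budget(inputs["input_ids"].shape[1], max_new_tokens)
    if budget == 0:
        return prompt  # no room left in the context window
    output_ids = model.generate(
        **inputs, max_new_tokens=budget, do_sample=True, temperature=0.5
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `complete("Once upon a time,")` downloads the weights on first use and returns the prompt followed by a sampled continuation. Because this is a base model, plain text continuation is the expected mode of use; chat-style prompts are unlikely to help.
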
## Evaluation
Atom was evaluated with EleutherAI's `lm-evaluation-harness` and ArithMark-2.0.
### lm-evaluation-harness
| Task | Metric | Score |
|---|---|---:|
| ARC-Easy | `acc_norm` | 33.08% |
| ARC-Challenge | `acc_norm` | 21.76% |
| HellaSwag | `acc_norm` | 27.65% |
| PIQA | `acc_norm` | 55.71% |
### ArithMark-2.0
| Benchmark | Metric | Score |
|---|---|---:|
| ArithMark-2.0 | `acc` | 27.36% |
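**Average score: 34.54%** (matches the mean of the four `lm-evaluation-harness` scores within rounding; a mean over all five scores, including ArithMark-2.0, would be 33.11%)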
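
As a cross-check on the reported average, the sketch below recomputes simple means from the rounded scores in the two tables. How the card's average was actually computed is not stated, so the comparison only shows which subset of scores the figure is consistent with.

```python
from statistics import mean

LM_EVAL_SCORES = {
    "ARC-Easy": 33.08,
    "ARC-Challenge": 21.76,
    "HellaSwag": 27.65,
    "PIQA": 55.71,
}
ARITHMARK_SCORE = 27.36

# Mean over the four lm-evaluation-harness tasks only.
lm_eval_mean = mean(list(LM_EVAL_SCORES.values()))
# Mean over all five reported scores.
overall_mean = mean([*LM_EVAL_SCORES.values(), ARITHMARK_SCORE])

print(f"lm-evaluation-harness mean: {lm_eval_mean:.2f}%")      # 34.55%
print(f"mean including ArithMark-2.0: {overall_mean:.2f}%")    # 33.11%
```

The first value differs from the reported average only in the last digit, which is what one would expect if the original figure was computed from unrounded task scores.
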
## Limitations
Atom is a very small model and should not be expected to produce reliable factual, safety-critical, or instruction-following outputs. Its short context window and limited capacity constrain coherence, knowledge recall, reasoning, and long-form generation.
The model may reproduce errors, biases, or undesirable patterns present in its training data. It has not undergone dedicated safety training and should not be used for high-stakes decisions.