# FlashLM v5.2 "Nova-Ignition"
A 5.0M-parameter language model designed for 2-CPU/5GB-RAM environments. Trained for 2 hours on a free-tier cloud CPU. No GPU was used, for either training or inference.
## Model Details
- Architecture: Standard Transformer with Rotary Positional Embeddings (RoPE)
- Parameters: ~5.0M
- Vocab Size: 4,096 (BPE)
- Context Length: 128 tokens
- d_model: 256
- Layers: 6
- Attention Heads: 4
- FFN Hidden: 512
- Activation: GELU
- Weight Tying: Yes (embedding ↔ output head)
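As a rough sanity check, the parameter count implied by the numbers above can be tallied. This is a back-of-envelope sketch; the exact bookkeeping behind the ~5.0M headline (biases, how the tied head is counted, etc.) is not stated in this card:

```python
# Back-of-envelope parameter count from the config above.
# Assumes bias-free attention/FFN projections and a tied embedding/head;
# the card's ~5.0M headline may count components differently.
vocab, d_model, n_layers, d_ffn = 4096, 256, 6, 512

embed = vocab * d_model        # token embedding, tied with the output head
attn = 4 * d_model * d_model   # Q, K, V and output projections per layer
ffn = 2 * d_model * d_ffn      # 256 -> 512 and 512 -> 256 per layer
norms = 2 * 2 * d_model        # two LayerNorms (scale + shift) per layer

total = embed + n_layers * (attn + ffn + norms) + 2 * d_model  # + final LayerNorm
print(f"~{total / 1e6:.2f}M parameters under these assumptions")
```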
## Architecture

```
Embedding (4K × 256, float, weight-tied)
→ 6 × NovaBlock:
      LayerNorm → MultiHeadAttention (RoPE) + residual
      LayerNorm → FFN (GELU, 256 → 512 → 256) + residual
→ LayerNorm → Output Head (tied to embedding)
```
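The RoPE step in the attention sub-block can be sketched in isolation. This is a minimal NumPy illustration of the rotation, not the repository's `model.py` implementation, and the function name is hypothetical:

```python
import numpy as np

def apply_rope(x, base=10000.0):
    """Rotary positional embedding over x of shape (seq_len, d_head).

    Dimension pairs (i, i + d_head/2) are rotated by position-dependent
    angles, so query-key dot products depend only on relative position.
    """
    seq_len, d_head = x.shape
    half = d_head // 2
    freqs = base ** (-np.arange(half) / half)      # per-pair rotation frequencies
    angles = np.outer(np.arange(seq_len), freqs)   # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)
```

Because each pair undergoes a pure rotation, vector norms are preserved, and scores between rotated queries and keys depend only on the distance between their positions.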
## Training
- Dataset: TinyStories V2 (validation split)
- Training Time: 2 hours
- Hardware: Free-tier cloud CPU (2 threads, 5GB RAM)
- Speed: ~3,500 tokens/sec
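The throughput and wall-clock figures above imply the total number of tokens processed, a simple cross-check. Note this counts every token seen (including any repeats or evaluation passes), so it sits somewhat above the ~20M figure mentioned under Limitations:

```python
# Tokens processed implied by the training figures above.
tokens_per_sec = 3500
seconds = 2 * 3600          # 2 hours of training
total_tokens = tokens_per_sec * seconds
print(f"~{total_tokens / 1e6:.1f}M tokens processed")
```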
## Benchmark Results
| Model | Params | BPC | PPL | Hardware |
|---|---|---|---|---|
| FlashLM v5.2 | 5.0M | 0.78 | 10.56 | 2-thread CPU |
| FlashLM v4 "Bolt" | 4.3M | 0.88 | 15.05 | 2-thread CPU |
| TinyStories-1M | 3.7M | 0.62 | 6.72 | V100 GPU |
v5.2 improves on v4 "Bolt" by ~11% relative BPC (0.78 vs 0.88) under the same 2-hour training budget.
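If PPL is measured per BPE token and BPC per character (an assumption; the card does not say how each was measured), the two columns imply roughly how many characters each BPE token covers:

```python
import math

# Implied characters per token, assuming PPL = 2 ** (BPC * chars_per_token).
# This relationship is an assumption about how the two metrics were measured.
for name, bpc, ppl in [("FlashLM v5.2", 0.78, 10.56),
                       ("FlashLM v4",   0.88, 15.05)]:
    chars_per_token = math.log2(ppl) / bpc
    print(f"{name}: ~{chars_per_token:.2f} chars/token")
```

Both rows come out near 4.4 characters per token, a plausible figure for a 4,096-entry BPE vocabulary on simple English text, so the two columns are mutually consistent.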
## Usage
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from tokenizers import Tokenizer

# Load tokenizer
tokenizer = Tokenizer.from_file("tokenizer.json")

# Load model (requires the architecture definition - see model.py)
model = NovaIgnitionLM(vocab=4096, d_model=256, n_layers=6,
                       n_heads=4, d_head=64, d_ffn=512)
model.load_state_dict(torch.load("best.pt", weights_only=True))

# Generate
prompt = "Once upon a time"
ids = tokenizer.encode(prompt).ids
x = torch.tensor([ids])
out = model.generate(x, max_new_tokens=80, temperature=0.8, top_k=40)
text = tokenizer.decode(out[0].tolist())
print(text)
```
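`generate` above lives in the model class in `model.py`; the kind of temperature/top-k sampling step it performs per token can be sketched as follows (an illustrative sketch, not the repository's actual implementation):

```python
import torch
import torch.nn.functional as F

def sample_next_token(logits, temperature=0.8, top_k=40):
    """One temperature + top-k sampling step (illustrative sketch).

    logits: 1-D tensor of next-token logits for the last position.
    """
    logits = logits / temperature
    k = min(top_k, logits.size(-1))
    kth_value = torch.topk(logits, k).values[-1]
    # Mask everything below the k-th largest logit, then sample.
    logits = logits.masked_fill(logits < kth_value, float("-inf"))
    probs = F.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()
```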
## Files

- `best.pt` - best model checkpoint
- `latest.pt` - latest checkpoint
- `config.json` - training configuration
## Limitations
- Small context window (128 tokens)
- Trained on limited data (~20M tokens)
- Not suitable for complex reasoning tasks
## License
MIT
## Citation

```bibtex
@misc{flashlm-v52,
  author = {Chang Cheng},
  title = {FlashLM v5.2 Nova-Ignition},
  year = {2026},
  url = {https://github.com/changcheng967/FlashLM}
}
```