# TinyBit
A compact GPT-style language model with 12.96M parameters, trained from scratch using PyTorch. Designed for research, experimentation, and resource-constrained environments (CPU-friendly).
## Model Details
| Property | Value |
|---|---|
| Architecture | GPT (decoder-only) |
| Parameters | 12.96M |
| Framework | PyTorch |
| Tokenizer | SentencePiece (BPE) |
| License | MIT |
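A quick way to check the reported 12.96M parameter count is to sum the element counts of the model's trainable tensors. Since `model.pt` isn't loaded here, the demo below runs on a stand-in `nn.Linear` module; pass the loaded TinyBit model instead.

```python
import torch.nn as nn

def count_params(model: nn.Module) -> int:
    """Total number of trainable parameters in a module."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Stand-in demo: a Linear layer with 10*5 weights + 5 biases = 55 parameters.
# For TinyBit, call count_params(model) after loading model.pt.
layer = nn.Linear(10, 5)
print(count_params(layer))  # 55
```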
## Files

- `model.pt` — PyTorch model weights
- `tokenizer.model` — SentencePiece tokenizer model
## Usage

```python
import torch
import sentencepiece as spm

# Load the SentencePiece tokenizer
sp = spm.SentencePieceProcessor()
sp.load("tokenizer.model")

# Load the full pickled model (the model class must be importable;
# on PyTorch >= 2.6, weights_only=False is required to unpickle it)
model = torch.load("model.pt", map_location="cpu", weights_only=False)
model.eval()

# Encode a prompt into token IDs and add a batch dimension
tokens = sp.encode("Hello, world!", out_type=int)
input_ids = torch.tensor([tokens])
```
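The snippet above stops at building `input_ids`. A minimal greedy decoding loop might look like the sketch below; it assumes the model is a callable returning logits of shape `(batch, seq, vocab)`, which may need adapting to TinyBit's actual forward signature.

```python
import torch

@torch.no_grad()
def greedy_generate(model, input_ids: torch.Tensor,
                    max_new_tokens: int = 20) -> torch.Tensor:
    """Append the argmax token one step at a time.

    Assumes model(ids) returns logits of shape (batch, seq, vocab);
    adjust to the real forward signature if it differs.
    """
    for _ in range(max_new_tokens):
        logits = model(input_ids)                               # (1, seq, vocab)
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True) # (1, 1)
        input_ids = torch.cat([input_ids, next_id], dim=1)
    return input_ids

# Hypothetical usage with the loaded model and tokenizer:
# out = greedy_generate(model, input_ids, max_new_tokens=30)
# print(sp.decode(out[0].tolist()))
```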
## Training
TinyBit was trained from scratch on a custom dataset. The architecture follows a standard GPT design with learned positional embeddings, multi-head self-attention, and feed-forward layers.
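The components named above (learned positional embeddings, multi-head self-attention, feed-forward layers) map onto a standard decoder-only transformer, sketched below. The dimensions, layer count, and pre-norm ordering are illustrative assumptions, not TinyBit's exact configuration.

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """One decoder block: causal multi-head self-attention plus a
    feed-forward layer, each with a residual connection."""
    def __init__(self, d_model: int = 256, n_heads: int = 4):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Boolean causal mask: True entries (future positions) are blocked
        seq = x.size(1)
        mask = torch.triu(torch.ones(seq, seq, dtype=torch.bool), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + attn_out
        x = x + self.ff(self.ln2(x))
        return x

class TinyGPT(nn.Module):
    """Token + learned positional embeddings, a stack of blocks, an LM head."""
    def __init__(self, vocab_size: int, d_model: int = 256, n_layers: int = 4,
                 n_heads: int = 4, max_len: int = 512):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)  # learned positions
        self.blocks = nn.ModuleList(Block(d_model, n_heads) for _ in range(n_layers))
        self.ln_f = nn.LayerNorm(d_model)
        self.head = nn.Linear(d_model, vocab_size, bias=False)

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        pos = torch.arange(ids.size(1), device=ids.device)
        x = self.tok(ids) + self.pos(pos)
        for blk in self.blocks:
            x = blk(x)
        return self.head(self.ln_f(x))  # (batch, seq, vocab) logits
```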
## Intended Use
- Language modeling research
- Educational purposes
- Lightweight text generation on CPU
- Fine-tuning experiments
## Limitations
- Small parameter count limits generation quality
- Not aligned or fine-tuned for instruction following
- May produce repetitive or incoherent text on out-of-domain inputs