# TinyWay 1.0.0
TinyWay 1.0.0 is a 52.94M parameter GPT-style causal language model trained from scratch on the TinyStories dataset.
The model is designed for lightweight story generation, research, and educational exploration of decoder-only Transformer architectures.
Rather than fine-tuning an existing checkpoint, TinyWay was implemented, trained, serialized, and released end-to-end, including a custom Hugging Face-compatible architecture.
## Model Overview
| Attribute | Value |
|---|---|
| Architecture | Decoder-only Transformer (GPT-style) |
| Parameters | 52.94M |
| Layers | 8 |
| Hidden size | 384 |
| Attention heads | 8 |
| Context length | 256 tokens |
| Tokenizer | GPT-2 BPE |
| Framework | PyTorch |
| Precision | FP16 (AMP during training) |
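As a sanity check, the reported parameter count can be roughly reproduced from the table above. This sketch assumes a standard GPT-2-style block (4x MLP expansion, biases, learned positional embeddings), GPT-2's 50,257-token vocabulary, and an untied LM head; these are assumptions, not confirmed details of TinyWay's implementation.

```python
# Rough parameter-count estimate from the configuration table.
# Assumes GPT-2-style blocks with biases, learned positional embeddings,
# GPT-2's 50,257-token vocabulary, and an untied LM head (assumptions).
vocab, d, layers, ctx = 50_257, 384, 8, 256

tok_emb = vocab * d          # token embedding matrix
pos_emb = ctx * d            # learned positional embeddings
per_layer = (
    3 * d * d + 3 * d        # fused QKV projection (weights + biases)
    + d * d + d              # attention output projection
    + d * 4 * d + 4 * d      # MLP up-projection
    + 4 * d * d + d          # MLP down-projection
    + 4 * d                  # two LayerNorms (gain + bias each)
)
final_ln = 2 * d             # final LayerNorm
lm_head = vocab * d          # untied output head

total = tok_emb + pos_emb + layers * per_layer + final_ln + lm_head
print(f"{total / 1e6:.2f}M")  # ~52.89M under these assumptions
```

The estimate lands within about 0.1M of the reported 52.94M; the small gap would come from implementation details such as bias or weight-tying choices.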
## Training Details
- Dataset: TinyStories (text file, streamed)
- Training strategy: Streaming token dataset
- Epochs: 1
- Effective batch size: 64
- Optimizer: AdamW
- Learning rate: 3e-4
- Dropout: 0.1
- Hardware: NVIDIA Tesla P100 (16GB)
- Environment: Kaggle
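The streaming token strategy can be sketched as an `IterableDataset` that tokenizes text lazily and yields fixed-length chunks matching the 256-token context window. This is a hypothetical minimal version with a toy character-level tokenizer for illustration; TinyWay's actual pipeline (EOS handling, shuffling, batching) may differ.

```python
import torch
from torch.utils.data import IterableDataset

class StreamingTokenDataset(IterableDataset):
    """Tokenize a text stream lazily and yield fixed-length token chunks.

    A minimal sketch of the streaming strategy described above, not
    TinyWay's actual data pipeline.
    """

    def __init__(self, lines, tokenize, block_size=256):
        self.lines = lines            # any iterable of strings, e.g. a file
        self.tokenize = tokenize      # str -> list[int]
        self.block_size = block_size

    def __iter__(self):
        buffer = []
        for line in self.lines:
            buffer.extend(self.tokenize(line))
            # Emit full blocks as soon as enough tokens accumulate.
            while len(buffer) >= self.block_size:
                chunk, buffer = buffer[:self.block_size], buffer[self.block_size:]
                yield torch.tensor(chunk, dtype=torch.long)

# Toy character-code "tokenizer" just to make the sketch runnable.
ds = StreamingTokenDataset(
    ["Once upon a time ..."] * 100,
    tokenize=lambda s: [ord(c) for c in s],
    block_size=8,
)
first = next(iter(ds))  # a length-8 LongTensor of token ids
```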
The model was trained using causal language modeling, predicting the next token given previous tokens.
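This next-token objective is a shifted cross-entropy loss: the logits at position i are scored against the token at position i+1. A minimal runnable sketch using a toy model (not TinyWay itself):

```python
import torch
import torch.nn.functional as F

# Toy stand-in model: embedding + linear head. Illustrates the causal-LM
# loss only; TinyWay's real architecture is an 8-layer Transformer.
vocab_size, hidden = 100, 16
model = torch.nn.Sequential(
    torch.nn.Embedding(vocab_size, hidden),
    torch.nn.Linear(hidden, vocab_size),
)

tokens = torch.randint(0, vocab_size, (1, 32))   # one 32-token sequence
logits = model(tokens)                           # (1, 32, vocab_size)

# Shift by one: position i predicts token i+1.
loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),
    tokens[:, 1:].reshape(-1),
)
loss.backward()   # an optimizer such as AdamW would then step
```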
## Intended Use
TinyWay is suitable for:
- Short story generation
- Educational demonstrations of Transformer internals
- Research on small-scale language models
- Understanding end-to-end LLM construction
## Limitations
- Trained only on narrative-style data (TinyStories)
- Not instruction-tuned
- Not suitable for factual QA or reasoning-heavy tasks
- Limited context window (256 tokens)
## Usage
Load the model and generate a short story. `trust_remote_code=True` is required because TinyWay ships its own custom architecture code:

```python
from transformers import AutoConfig, AutoTokenizer, AutoModelForCausalLM

model_id = "shivamsharma120120/TinyWay-1.0.0"

# trust_remote_code=True is required: TinyWay uses a custom architecture.
config = AutoConfig.from_pretrained(model_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    config=config,
    trust_remote_code=True,
)

# Sample up to 100 new tokens with nucleus sampling.
inputs = tokenizer("Once upon a time", return_tensors="pt")
output = model.generate(
    **inputs,
    max_new_tokens=100,
    temperature=0.8,
    top_p=0.95,
    do_sample=True,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```