---
license: apache-2.0
language: en
tags:
- causal-lm
- from-scratch
- transformer
- tiny-stories
- pytorch
- custom-architecture
- text-generation
---

# TinyWay 1.0.0

**TinyWay 1.0.0** is a **52.94M parameter GPT-style causal language model** trained **from scratch** on the **TinyStories** dataset. The model is designed for **lightweight story generation, research, and educational exploration** of decoder-only Transformer architectures.

Unlike fine-tuned models, TinyWay was **implemented, trained, serialized, and released end-to-end**, including a **custom Hugging Face-compatible architecture**.

---

## 🔍 Model Overview

| Attribute | Value |
|-----------|-------|
| Architecture | Decoder-only Transformer (GPT-style) |
| Parameters | **52.94M** |
| Layers | 8 |
| Hidden size | 384 |
| Attention heads | 8 |
| Context length | 256 tokens |
| Tokenizer | GPT-2 BPE |
| Framework | PyTorch |
| Precision | FP16 (AMP during training) |

---

## 📚 Training Details

- **Dataset**: TinyStories (text file, streamed)
- **Training strategy**: Streaming token dataset
- **Epochs**: 1
- **Effective batch size**: 64
- **Optimizer**: AdamW
- **Learning rate**: 3e-4
- **Dropout**: 0.1
- **Hardware**: NVIDIA Tesla P100 (16GB)
- **Environment**: Kaggle

The model was trained with a **causal language modeling** objective: predicting the next token given all previous tokens.
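The table values above are enough for a back-of-the-envelope check of the reported parameter count. The sketch below assumes GPT-2-style blocks (4× MLP expansion, bias terms, two LayerNorms per block), the standard GPT-2 BPE vocabulary of 50,257 tokens, and untied input/output embeddings — none of these details are stated on the card, so treat them as assumptions:

```python
# Back-of-the-envelope parameter count for TinyWay 1.0.0.
# Card values: 8 layers, hidden size 384, context length 256, GPT-2 BPE.
# Assumptions (not stated on the card): untied input/output embeddings,
# 4x MLP expansion, vocab size 50257, biases and LayerNorms as in GPT-2.

vocab, d, n_layers, ctx = 50257, 384, 8, 256

tok_emb = vocab * d                        # token embedding matrix
pos_emb = ctx * d                          # learned position embeddings

# One transformer block: QKV + output projections, 4x MLP, two LayerNorms.
attn  = 3 * (d * d + d) + (d * d + d)      # Q/K/V projections + output projection
mlp   = (d * 4 * d + 4 * d) + (4 * d * d + d)
norms = 2 * (2 * d)                        # two LayerNorms (weight + bias each)
block = attn + mlp + norms

final_norm = 2 * d
lm_head    = vocab * d                     # untied output head (assumption)

total = tok_emb + pos_emb + n_layers * block + final_norm + lm_head
print(f"{total / 1e6:.2f}M parameters")    # ~52.9M, close to the reported 52.94M
```

Under these assumptions the count lands at roughly 52.9M, consistent with the 52.94M reported above; the small residual likely comes from implementation details (e.g. vocabulary padding) not visible from the card.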
---

## 🎯 Intended Use

TinyWay is suitable for:

- Short story generation
- Educational demonstrations of Transformer internals
- Research on small-scale language models
- Understanding end-to-end LLM construction

---

## ⚠️ Limitations

- Trained only on narrative-style data (TinyStories)
- Not instruction-tuned
- Not suitable for factual QA or reasoning-heavy tasks
- Limited context window (256 tokens)

---

## 🚀 Usage

### Load and generate text

```python
from transformers import AutoConfig, AutoTokenizer, AutoModelForCausalLM

model_id = "shivamsharma120120/TinyWay-1.0.0"

config = AutoConfig.from_pretrained(
    model_id,
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    config=config,
    trust_remote_code=True
)

inputs = tokenizer("Once upon a time", return_tensors="pt")

output = model.generate(
    **inputs,
    max_new_tokens=100,
    temperature=0.8,
    top_p=0.95,
    do_sample=True
)

print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Note that `trust_remote_code=True` is required because the model uses a custom architecture shipped with the repository.