---
language: en
license: mit
tags:
- pytorch
- language-model
- transformer
- tiny-shakespeare
library_name: transformers
model_name: mini-language-model
pipeline_tag: text-generation
---

# Mini Language Model

## 🧠 Model Description

This is a toy decoder-only language model based on a TransformerDecoder architecture. It was trained from scratch on the [Tiny Shakespeare dataset](https://huggingface.co/datasets/tiny_shakespeare) using PyTorch. The goal was to explore autoregressive language modeling with minimal resources, using only `torch.nn` and a `transformers` tokenizer.

## 🏋️ Training Details

- **Architecture**: TransformerDecoder
- **Tokenizer**: GPT2Tokenizer from Hugging Face
- **Vocabulary Size**: 50,257 (from GPT-2)
- **Sequence Length**: 64 tokens
- **Batch Size**: 8
- **Epochs**: 5
- **Learning Rate**: 1e-3
- **Number of Parameters**: ~900k
- **Hardware**: Trained on CPU (Google Colab)

## 📊 Evaluation

The model was evaluated on a 10% validation split. Training and validation loss decreased consistently, though the model is not expected to produce coherent long text given the small dataset.

## 📂 Intended Use

This model is intended for educational purposes only. It is **not suitable for production use**.

## 🚫 Limitations

- Trained only on a tiny dataset
- Small architecture with limited capacity
- Limited ability to generalize or generate meaningful long text

## 💬 Example Usage (Python)

```python
import torch
from transformers import GPT2Tokenizer
from model import MiniDecoderModel  # assuming you restore the class definition

tokenizer = GPT2Tokenizer.from_pretrained("Pavloria/mini-language-model")
model = MiniDecoderModel(...)  # instantiate with your config
model.load_state_dict(torch.load("pytorch_model.bin"))
```
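The `MiniDecoderModel` class is not bundled with the checkpoint, so it has to be reconstructed from the training script. As a rough guide, a compatible class might look like the sketch below; the model width, head count, and layer count here are illustrative assumptions, not the exact values behind this checkpoint (only the vocabulary size and sequence length come from the card above).

```python
import torch
import torch.nn as nn

class MiniDecoderModel(nn.Module):
    """Minimal decoder-only LM sketch; dimensions are illustrative guesses."""

    def __init__(self, vocab_size=50257, max_len=64,
                 d_model=16, nhead=2, num_layers=2):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerDecoderLayer(d_model=d_model, nhead=nhead,
                                           batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=num_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, ids):
        # ids: (batch, seq_len) token IDs
        seq_len = ids.size(1)
        pos = torch.arange(seq_len, device=ids.device)
        x = self.tok_emb(ids) + self.pos_emb(pos)
        # Causal mask: each position may attend only to itself and earlier ones.
        mask = torch.triu(torch.full((seq_len, seq_len), float("-inf"),
                                     device=ids.device), diagonal=1)
        # Decoder-only setup: feed the sequence as its own "memory",
        # masking both self- and cross-attention causally.
        h = self.decoder(tgt=x, memory=x, tgt_mask=mask, memory_mask=mask)
        return self.lm_head(h)  # (batch, seq_len, vocab_size) logits
```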
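Once the weights are loaded, text can be generated with a simple autoregressive loop. The greedy-decoding sketch below assumes the `tokenizer` and `model` objects from the snippet above; swapping the `argmax` for sampling would produce more varied output.

```python
import torch

prompt = "ROMEO:"
ids = tokenizer.encode(prompt, return_tensors="pt")

model.eval()
with torch.no_grad():
    for _ in range(50):                       # generate 50 new tokens
        logits = model(ids[:, -64:])          # stay within the 64-token window
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)  # greedy pick
        ids = torch.cat([ids, next_id], dim=1)

print(tokenizer.decode(ids[0]))
```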