---
language: en
license: apache-2.0
library_name: transformers
tags:
- text-generation
- pytorch
- gpt
- language-model
---

# tinyMind

This is a small transformer language model trained from scratch, with 17,731,328 parameters (~17M).

## Model Details

- **Architecture**: GPT-style transformer
- **Parameters**: ~17M
- **Layers**: 6
- **Attention Heads**: 8
- **Embedding Dimension**: 256
- **Max Sequence Length**: 512
- **Vocabulary Size**: 50257

A sketch that maps these hyperparameters onto a Hugging Face `GPT2Config` is included at the end of this card.

## Training Data

The model was trained on a diverse mixture of high-quality text data, including:

- OpenWebText
- Wikipedia articles
- BookCorpus
- Other curated text sources

## Usage

```python
from transformers import GPT2TokenizerFast, AutoModelForCausalLM

# Load the tokenizer and model from the Hub
tokenizer = GPT2TokenizerFast.from_pretrained("HenrySentinel/tinyMind")
model = AutoModelForCausalLM.from_pretrained("HenrySentinel/tinyMind")

# Generate text with sampling
input_text = "The key to artificial intelligence is"
input_ids = tokenizer.encode(input_text, return_tensors="pt")
output = model.generate(input_ids, max_length=100, temperature=0.8, do_sample=True)

generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```

## Training Details

- **Optimizer**: AdamW with cosine learning rate scheduling
- **Learning Rate**: 0.001
- **Batch Size**: 8
- **Sequence Length**: 512
- **Epochs**: 3
- **Gradient Clipping**: 1.0

A sketch of this optimization setup appears at the end of this card.

## Limitations

This is a small model designed for experimentation and learning. It may:

- Generate inconsistent or factually incorrect content
- Have limited knowledge compared to larger models
- Require careful prompt engineering for best results

## License

Apache 2.0
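
## Configuration Sketch

The hyperparameters listed under Model Details roughly correspond to the following Hugging Face `GPT2Config`. This is a minimal sketch, not the exact training configuration: any setting not shown is left at its GPT-2 default, and the released checkpoint may differ in small details such as bias terms or weight tying.

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Hypothetical reconstruction of the architecture from the "Model Details" section.
# Values not listed on the card are left at their GPT-2 defaults.
config = GPT2Config(
    vocab_size=50257,   # GPT-2 BPE vocabulary
    n_positions=512,    # max sequence length
    n_embd=256,         # embedding dimension
    n_layer=6,          # transformer blocks
    n_head=8,           # attention heads per block
)

model = GPT2LMHeadModel(config)
print(f"{model.num_parameters():,} parameters")  # roughly 17.7M with tied embeddings
```

The exact count depends on details such as attention biases and embedding tying, which is why it can differ slightly from the figure quoted at the top of this card.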
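
## Training Setup Sketch

The optimizer settings under Training Details could be wired up as below. This is an illustrative sketch, not the original training script: the warmup length and the dummy data are assumptions, and only the learning rate, batch size, sequence length, epoch count, and gradient-clipping value come from this card.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import get_cosine_schedule_with_warmup

# Dummy token ids just to make the sketch runnable; replace with a real tokenized corpus.
dummy_ids = torch.randint(0, 50257, (64, 512))   # 64 sequences of 512 tokens
dataloader = DataLoader(TensorDataset(dummy_ids), batch_size=8)

# AdamW with a cosine learning-rate schedule (warmup length is a placeholder).
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=100,
    num_training_steps=len(dataloader) * 3,       # 3 epochs
)

model.train()
for epoch in range(3):
    for (input_ids,) in dataloader:
        loss = model(input_ids=input_ids, labels=input_ids).loss
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # gradient clipping at 1.0
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()
```

Here `model` is the `GPT2LMHeadModel` instance from the configuration sketch above.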