---
language: en
license: apache-2.0
library_name: transformers
tags:
- text-generation
- pytorch
- gpt
- language-model
---

# tinyMind

This is a small transformer language model trained from scratch, with 17,731,328 (~17.7M) parameters.

## Model Details

- **Architecture**: GPT-style transformer
- **Parameters**: ~17M
- **Layers**: 6
- **Attention Heads**: 8
- **Embedding Dimension**: 256
- **Max Sequence Length**: 512
- **Vocabulary Size**: 50257

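As a sanity check, the headline parameter count can be roughly reproduced from the configuration above. This is a back-of-envelope sketch that assumes a standard GPT layer layout (4·d² for attention, 8·d² for the feed-forward block) and ignores biases and layer norms, so it lands slightly below the exact 17,731,328:

```python
# Rough parameter count from the config above (biases and layer norms ignored).
vocab_size, d_model, n_layers, max_seq_len = 50257, 256, 6, 512

embeddings = vocab_size * d_model + max_seq_len * d_model  # token + positional
per_layer = 12 * d_model * d_model  # attention (4*d^2) + feed-forward (8*d^2)
total = embeddings + n_layers * per_layer

print(f"{total:,}")  # 17,715,456 -- within ~0.1% of the reported 17,731,328
```

The token embedding table alone (50257 × 256 ≈ 12.9M) accounts for most of the model, which is typical at this scale.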
## Training Data

The model was trained on a diverse mixture of high-quality text data, including:
- OpenWebText
- Wikipedia articles
- BookCorpus
- Other curated text sources

## Usage

```python
from transformers import GPT2TokenizerFast, AutoModelForCausalLM

tokenizer = GPT2TokenizerFast.from_pretrained("HenrySentinel/tinyMind")
model = AutoModelForCausalLM.from_pretrained("HenrySentinel/tinyMind")

# Generate text
input_text = "The key to artificial intelligence is"
input_ids = tokenizer.encode(input_text, return_tensors="pt")
output = model.generate(input_ids, max_length=100, temperature=0.8, do_sample=True)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```

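The `temperature=0.8` passed to `generate` rescales the logits before sampling; values below 1.0 make the distribution peakier, so the model picks high-probability tokens more often. A minimal, self-contained sketch of the effect (plain softmax over toy logits, not tied to the model above):

```python
import math

def softmax(logits, temperature=1.0):
    # Divide logits by temperature, then normalize with a max-shifted (stable) softmax.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
p_default = softmax(logits, temperature=1.0)
p_sharper = softmax(logits, temperature=0.8)
# The top token gets more probability mass at temperature 0.8 than at 1.0.
print(p_default[0], p_sharper[0])
```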
## Training Details

- **Optimizer**: AdamW with cosine learning rate scheduling
- **Learning Rate**: 0.001
- **Batch Size**: 8
- **Sequence Length**: 512
- **Epochs**: 3
- **Gradient Clipping**: 1.0

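The cosine learning-rate schedule above can be sketched in a few lines. This is an illustrative sketch, not the training code; the warmup length and final learning rate are hypothetical choices the model card does not state:

```python
import math

BASE_LR, MIN_LR, WARMUP = 1e-3, 0.0, 100  # WARMUP and MIN_LR are assumed values

def cosine_lr(step, total_steps):
    # Linear warmup to BASE_LR, then cosine decay down to MIN_LR.
    if step < WARMUP:
        return BASE_LR * (step + 1) / WARMUP
    progress = (step - WARMUP) / max(1, total_steps - WARMUP)
    return MIN_LR + 0.5 * (BASE_LR - MIN_LR) * (1 + math.cos(math.pi * progress))

# LR rises during warmup, peaks at BASE_LR, and decays to MIN_LR by the end.
print(cosine_lr(99, 1000), cosine_lr(1000, 1000))
```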
## Limitations

This is a small model designed for experimentation and learning. It may:
- Generate inconsistent or factually incorrect content
- Have limited knowledge compared to larger models
- Require careful prompt engineering for best results

## License

Apache 2.0