---
language: en
license: apache-2.0
library_name: transformers
tags:
- text-generation
- pytorch
- gpt
- language-model
---

# tinyMind

This is a small transformer language model trained from scratch, with 17,731,328 (~17.7M) parameters.

## Model Details

- **Architecture**: GPT-style transformer
- **Parameters**: ~17M
- **Layers**: 6
- **Attention Heads**: 8
- **Embedding Dimension**: 256
- **Max Sequence Length**: 512
- **Vocabulary Size**: 50257
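
As a sanity check, the figures above roughly account for the reported parameter count. The sketch below assumes a standard GPT-2-style block (tied input/output embeddings, 4x FFN expansion, biases on all linears, two LayerNorms per layer); the exact total depends on implementation details, so this is only an approximation:

```python
# Rough parameter-count estimate from the architecture table above.
# Assumptions (not confirmed by the model card): GPT-2-style blocks,
# tied embeddings, 4x FFN expansion, biases everywhere.
d, layers, vocab, ctx = 256, 6, 50257, 512

embeddings = vocab * d + ctx * d                 # token + position embeddings
attn = d * 3 * d + 3 * d + d * d + d             # QKV projection + output projection
ffn = d * 4 * d + 4 * d + 4 * d * d + d          # two FFN linears (4x expansion)
norms = 2 * 2 * d                                # two LayerNorms (gain + bias)
per_layer = attn + ffn + norms
total = embeddings + layers * per_layer + 2 * d  # + final LayerNorm; lm_head tied

print(f"~{total / 1e6:.1f}M parameters")
```

Under these assumptions the estimate lands within a fraction of a percent of the reported 17,731,328; the small gap suggests a minor implementation difference (e.g. bias handling).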

## Training Data

The model was trained on a diverse mixture of high-quality text data including:

- OpenWebText
- Wikipedia articles
- BookCorpus
- Other curated text sources

## Usage

```python
from transformers import GPT2TokenizerFast, AutoModelForCausalLM

tokenizer = GPT2TokenizerFast.from_pretrained("HenrySentinel/tinyMind")
model = AutoModelForCausalLM.from_pretrained("HenrySentinel/tinyMind")

# Generate text
input_text = "The key to artificial intelligence is"
input_ids = tokenizer.encode(input_text, return_tensors="pt")
output = model.generate(
    input_ids,
    max_length=100,
    temperature=0.8,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 tokenizers define no pad token
)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```

## Training Details

- **Optimizer**: AdamW with cosine learning rate scheduling
- **Learning Rate**: 0.001
- **Batch Size**: 8
- **Sequence Length**: 512
- **Epochs**: 3
- **Gradient Clipping**: 1.0
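
A minimal PyTorch sketch of the training loop implied by this table (AdamW, cosine learning-rate decay, gradient clipping at 1.0). The tiny `Linear` model, dummy loss, and step count are placeholders standing in for the actual 17M-parameter transformer and its data pipeline:

```python
import torch

# Placeholder model and data; the real run trains the GPT described above.
model = torch.nn.Linear(256, 256)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)  # LR 0.001, as listed
total_steps = 100  # illustrative; real training runs 3 epochs over the corpus
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=total_steps)

for step in range(total_steps):
    x = torch.randn(8, 256)        # batch size 8, as listed above
    loss = model(x).pow(2).mean()  # dummy loss in place of cross-entropy
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # clipping 1.0
    optimizer.step()
    scheduler.step()  # cosine decay toward zero

print(f"final lr: {scheduler.get_last_lr()[0]:.2e}")
```

`CosineAnnealingLR` anneals the learning rate from 0.001 toward zero over `T_max` steps; clipping the global gradient norm to 1.0 guards against loss spikes common when training small transformers at this learning rate.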

## Limitations

This is a small model designed for experimentation and learning. It may:

- Generate inconsistent or factually incorrect content
- Have limited knowledge compared to larger models
- Require careful prompt engineering for best results

## License

Apache 2.0
|