# scratch-model
![Model Type](https://img.shields.io/badge/Model%20Type-Scratch%20Transformer-blue)
![Parameters](https://img.shields.io/badge/Parameters-13.5M-green)
This is a 13.5M-parameter transformer decoder created from scratch with the Incremental Model Trainer.
## Model Configuration
- **Architecture**: Transformer decoder (GPT2-compatible)
- **Parameters**: 13.5M
- **Hidden Size**: 256
- **Layers**: 16
- **Attention Heads**: 16
- **FFN Dimension**: 512
- **Vocabulary Size**: 8000
- **Max Sequence Length**: 4096
- **Dropout**: 0.1
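
As a rough sanity check on the parameter count, the sketch below estimates the size of a standard GPT-2-style decoder from the values listed above. The function name `estimate_params` and its `tie_embeddings` and `bias` switches are hypothetical, and the layout (learned position embeddings, two LayerNorms per block, a final LayerNorm) is an assumption about a typical GPT-2 design, not a description of the actual `ScratchTransformer` implementation.

```python
def estimate_params(vocab_size=8000, hidden=256, layers=16, ffn=512,
                    max_len=4096, tie_embeddings=False, bias=True):
    """Estimate parameters of a GPT-2-style decoder (assumed layout)."""
    tok_emb = vocab_size * hidden          # token embedding table
    pos_emb = max_len * hidden             # learned position embeddings (assumed)
    # Q, K, V and output projections; the head count does not change this total
    attn = 4 * hidden * hidden + (4 * hidden if bias else 0)
    # Two linear layers: hidden -> ffn -> hidden
    mlp = 2 * hidden * ffn + ((ffn + hidden) if bias else 0)
    norms = 2 * (2 * hidden)               # two LayerNorms (scale + shift) per block
    final_norm = 2 * hidden                # LayerNorm after the last block
    lm_head = 0 if tie_embeddings else vocab_size * hidden
    return tok_emb + pos_emb + layers * (attn + mlp + norms) + final_norm + lm_head


print(f"untied output head: {estimate_params():,}")
print(f"tied output head:   {estimate_params(tie_embeddings=True):,}")
```

Under these assumptions the estimate is about 13.58M with a separate output projection and about 11.53M with tied embeddings, so the untied figure is the one closer to the stated 13.5M. The exact count depends on details such as bias terms and weight tying that this card does not specify.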
## Usage
```python
from trainer.scratch_model import ScratchModelCreator
creator = ScratchModelCreator()
# Load the model, tokenizer, and config from a local directory
model, tokenizer, config = creator.load_with_tokenizer("path/to/model")
# Or download from the Hugging Face Hub first, then load from the returned local path
local_path = creator.download_from_hub("username/scratch-model-name")
model, tokenizer, config = creator.load_with_tokenizer(local_path)
```
## Loading with Transformers
Although the configuration is GPT2-compatible, the model requires the custom `ScratchTransformer` class to load. Use `ScratchModelCreator` as shown in the Usage section.
Created with [Incremental Model Trainer](https://huggingface.co/spaces/broadfield/incremental-model-trainer)