# scratch-model

This is a scratch transformer model created using the Incremental Model Trainer.

## Model Configuration

- **Architecture**: Transformer decoder
- **Parameters**: 9.3M
- **Hidden Size**: 256
- **Layers**: 8
- **Attention Heads**: 4
- **FFN Dimension**: 512
- **Vocabulary Size**: 8000
- **Max Sequence Length**: 4096
- **Dropout**: 0.1
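
As a rough sanity check on the listed parameter count, the sketch below estimates it from the configuration values. The function `estimate_params` is hypothetical and not part of the trainer. It assumes a GPT-style decoder layout that this card does not confirm: learned positional embeddings, biased linear layers, two LayerNorms per block plus a final one, and an output head that is not tied to the token embedding.

```python
def estimate_params(vocab=8000, hidden=256, layers=8, ffn=512, max_len=4096,
                    tied_head=False, learned_positions=True):
    """Estimate the parameter count of a GPT-style decoder (assumed layout)."""
    tok_emb = vocab * hidden
    # Assumption: positions are a learned table, one row per position.
    pos_emb = max_len * hidden if learned_positions else 0
    # Q, K, V and output projections, each hidden x hidden with a bias.
    attn = 4 * (hidden * hidden + hidden)
    # Two feed-forward projections (up and down), each with a bias.
    mlp = (hidden * ffn + ffn) + (ffn * hidden + hidden)
    # Two LayerNorms per block, each with a scale and a shift vector.
    norms = 2 * 2 * hidden
    per_layer = attn + mlp + norms
    final_norm = 2 * hidden
    # Assumption: a separate, bias-free output projection unless tied.
    head = 0 if tied_head else vocab * hidden
    return tok_emb + pos_emb + layers * per_layer + final_norm + head


if __name__ == "__main__":
    print(f"untied head: {estimate_params():,}")
    print(f"tied head:   {estimate_params(tied_head=True):,}")
```

Under these assumptions the untied variant comes to 9,361,920 parameters, close to the 9.3M listed above, while tying the head to the embedding gives 7,313,920. The actual layout of the model may differ.
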
## Usage

```python
import json
from trainer.scratch_model import ScratchModelConfig, ScratchTransformer

# Read the saved hyperparameters, then load the weights from the current directory
with open("config.json") as f:
    config = ScratchModelConfig.from_dict(json.load(f))
model = ScratchTransformer.from_pretrained(".", config)
```

Created with [Incremental Model Trainer](https://huggingface.co/spaces/broadfield/incremental-model-trainer)