# scratch-model

This is a scratch transformer model created using the Incremental Model Trainer.

## Model Configuration

- **Architecture**: Transformer decoder (GPT2-compatible)
- **Parameters**: 13.5M
- **Hidden Size**: 256
- **Layers**: 16
- **Attention Heads**: 16
- **FFN Dimension**: 512
- **Vocabulary Size**: 8000
- **Max Sequence Length**: 4096
- **Dropout**: 0.1
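
As a sanity check on the listed parameter count, the figures above can be plugged into a rough estimate. This is a minimal sketch that assumes a standard GPT-2-style layout (learned absolute position embeddings, biased linear layers, two LayerNorms per block plus a final one); the actual `ScratchTransformer` layout is not documented here and may differ. The function name `estimate_params` is hypothetical.

```python
def estimate_params(vocab=8000, hidden=256, layers=16, ffn=512,
                    max_len=4096, tie_output=False):
    """Rough parameter count for a GPT-2-style decoder (assumed layout)."""
    tok_emb = vocab * hidden                  # token embedding table
    pos_emb = max_len * hidden                # assumes learned absolute positions
    attn = 4 * (hidden * hidden + hidden)     # Q, K, V and output projections, with biases
    mlp = (hidden * ffn + ffn) + (ffn * hidden + hidden)  # up and down projections
    norms = 2 * 2 * hidden                    # two LayerNorms per block (weight + bias)
    block = attn + mlp + norms
    final_norm = 2 * hidden
    # A tied output head reuses the token embedding and adds no parameters.
    lm_head = 0 if tie_output else vocab * hidden
    return tok_emb + pos_emb + layers * block + final_norm + lm_head


untied = estimate_params(tie_output=False)
tied = estimate_params(tie_output=True)
print(f"untied head: {untied / 1e6:.2f}M, tied head: {tied / 1e6:.2f}M")
# prints "untied head: 13.58M, tied head: 11.53M"
```

Under these assumptions the untied estimate is close to the listed 13.5M, which suggests (but does not confirm) that the output projection is counted separately from the token embedding.
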

## Usage

```python
from trainer.scratch_model import ScratchModelCreator

creator = ScratchModelCreator()

# Load from local
model, tokenizer, config = creator.load_with_tokenizer("path/to/model")

# Or load from HuggingFace Hub
local_path = creator.download_from_hub("username/scratch-model-name")
model, tokenizer, config = creator.load_with_tokenizer(local_path)
```

## Loading with Transformers

This model uses a GPT2-compatible configuration but requires the custom `ScratchTransformer` class to load. Use the `ScratchModelCreator` as shown above.
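
To illustrate what "GPT2-compatible configuration" could mean in practice, the sketch below maps the hyperparameters from the Model Configuration list onto the field names used by Hugging Face's `GPT2Config`. The mapping is an assumption: the `config.json` actually shipped with the model may use different keys, and applying the single 0.1 dropout value to all three dropout fields is a guess.

```python
import json

# Hyperparameters from the Model Configuration list, keyed by GPT2Config-style
# field names. This mapping is an assumption, not the model's actual config file.
gpt2_style_config = {
    "vocab_size": 8000,     # Vocabulary Size
    "n_positions": 4096,    # Max Sequence Length
    "n_embd": 256,          # Hidden Size
    "n_layer": 16,          # Layers
    "n_head": 16,           # Attention Heads
    "n_inner": 512,         # FFN Dimension
    "resid_pdrop": 0.1,     # Dropout (assumed to apply to all three fields)
    "embd_pdrop": 0.1,
    "attn_pdrop": 0.1,
}

# Multi-head attention needs the hidden size to divide evenly across heads.
assert gpt2_style_config["n_embd"] % gpt2_style_config["n_head"] == 0
head_dim = gpt2_style_config["n_embd"] // gpt2_style_config["n_head"]

print(json.dumps(gpt2_style_config, indent=2))
print("per-head dimension:", head_dim)
```

Even with a matching configuration, the weights still need the custom `ScratchTransformer` class, so this dictionary is only useful for inspecting or comparing hyperparameters.
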

Created with [Incremental Model Trainer](https://huggingface.co/spaces/broadfield/incremental-model-trainer)