broadfield/scratch-model-1776352887

# scratch-model

![Model Type](https://img.shields.io/badge/Model%20Type-Scratch%20Transformer-blue)
![Parameters](https://img.shields.io/badge/Parameters-13.5M-green)

This is a scratch transformer model created using the Incremental Model Trainer.

## Model Configuration

- Architecture: Transformer decoder (GPT2-compatible)
- Parameters: 13.5M
- Hidden Size: 256
- Layers: 16
- Attention Heads: 16
- FFN Dimension: 512
- Vocabulary Size: 8000
- Max Sequence Length: 4096
- Dropout: 0.1
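
The stated parameter total can be sanity-checked from the dimensions listed above. The sketch below assumes a standard GPT-2 layout (learned position embeddings, biased linear layers, two LayerNorms per block plus a final one), which this card does not confirm for `ScratchTransformer`. `count_params` is a hypothetical helper, not part of the trainer.

```python
def count_params(vocab=8000, max_len=4096, hidden=256, layers=16, ffn=512, tied=True):
    """Estimate the parameter count of an assumed GPT-2-style decoder."""
    # Token embeddings plus learned position embeddings
    embeddings = vocab * hidden + max_len * hidden
    # Fused Q/K/V projection and output projection, each with a bias
    attention = (hidden * 3 * hidden + 3 * hidden) + (hidden * hidden + hidden)
    # Two-layer feed-forward network with biases
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden)
    # Two LayerNorms per block, each with a weight and a bias vector
    layer_norms = 2 * (2 * hidden)
    per_layer = attention + feed_forward + layer_norms
    final_norm = 2 * hidden
    # An untied output head adds a second vocab x hidden matrix
    lm_head = 0 if tied else vocab * hidden
    return embeddings + layers * per_layer + final_norm + lm_head


print(f"tied head:   {count_params(tied=True):,}")
print(f"untied head: {count_params(tied=False):,}")
```

Under these assumptions the untied variant comes to roughly 13.6M parameters, close to the stated 13.5M, while a tied output head would give roughly 11.5M. The remaining gap may come from rounding or from layout details not listed in this card.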

## Usage

```python
# Requires the `trainer` package (which provides ScratchModelCreator) to be importable
from trainer.scratch_model import ScratchModelCreator

creator = ScratchModelCreator()

# Load from a local directory
model, tokenizer, config = creator.load_with_tokenizer("path/to/model")

# Or download from the Hugging Face Hub first, then load from the returned local path
local_path = creator.download_from_hub("username/scratch-model-name")
model, tokenizer, config = creator.load_with_tokenizer(local_path)
```

## Loading with Transformers

The configuration is GPT2-compatible, but loading the model requires the custom `ScratchTransformer` class rather than the stock Transformers model classes. Use `ScratchModelCreator` as shown above.
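
To illustrate what "GPT2-compatible configuration" could mean in practice, the sketch below maps the values from the Model Configuration list onto the usual GPT-2 config key names. The key names are an assumption about this repository's config file, applying the single dropout value to all three dropout fields is also an assumption, and `head_dim` is a hypothetical helper.

```python
# Values come from the Model Configuration list; key names follow the
# usual GPT-2 config convention (assumed, not confirmed by this card).
gpt2_style_config = {
    "vocab_size": 8000,
    "n_positions": 4096,   # max sequence length
    "n_embd": 256,         # hidden size
    "n_layer": 16,
    "n_head": 16,
    "n_inner": 512,        # FFN dimension
    "resid_pdrop": 0.1,    # assumed: one dropout value used everywhere
    "embd_pdrop": 0.1,
    "attn_pdrop": 0.1,
}


def head_dim(config):
    """Return the per-head dimension, checking that heads divide the hidden size."""
    n_embd, n_head = config["n_embd"], config["n_head"]
    if n_embd % n_head != 0:
        raise ValueError("n_embd must be divisible by n_head")
    return n_embd // n_head


print(head_dim(gpt2_style_config))  # 16
```

With a hidden size of 256 split across 16 heads, each attention head works on 16 dimensions.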

Created with [Incremental Model Trainer](https://huggingface.co/spaces/broadfield/incremental-model-trainer)