scratch-model

This is a from-scratch transformer model created using the Incremental Model Trainer.

Model Configuration

  • Architecture: Transformer decoder (GPT2-compatible)
  • Parameters: 13.5M
  • Hidden Size: 256
  • Layers: 16
  • Attention Heads: 16
  • FFN Dimension: 512
  • Vocabulary Size: 8000
  • Max Sequence Length: 4096
  • Dropout: 0.1
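
The listed parameter count can be sanity-checked from the other values. The sketch below is a rough estimate only: it assumes a standard GPT-2 style layout (learned absolute position embeddings, biases on all linear layers, two LayerNorms per block plus a final one), which may not match the actual ScratchTransformer implementation. The function name `estimate_params` and its arguments are hypothetical and are not part of the trainer.

```python
def estimate_params(vocab, hidden, layers, ffn, max_len,
                    tie_embeddings=True, bias=True):
    """Rough parameter count for a GPT-2 style decoder (assumed layout)."""
    token_emb = vocab * hidden
    pos_emb = max_len * hidden          # assumes learned absolute positions
    attn = 4 * hidden * hidden          # Q, K, V and output projections
    mlp = 2 * hidden * ffn              # up and down projections
    if bias:
        attn += 4 * hidden
        mlp += ffn + hidden
    norms = 2 * (2 * hidden)            # two LayerNorms per block, scale + shift
    block = attn + mlp + norms
    final_norm = 2 * hidden
    lm_head = 0 if tie_embeddings else vocab * hidden
    return token_emb + pos_emb + layers * block + final_norm + lm_head


# Values taken from the Model Configuration list above.
card = dict(vocab=8000, hidden=256, layers=16, ffn=512, max_len=4096)
tied = estimate_params(**card)
untied = estimate_params(**card, tie_embeddings=False)
print(f"tied: {tied:,}")      # tied: 11,530,752
print(f"untied: {untied:,}")  # untied: 13,578,752
```

Under these assumptions the untied estimate (about 13.6M) lands close to the 13.5M listed above, while tying the output head to the token embeddings would give about 11.5M. This is suggestive rather than conclusive; the exact figure depends on details such as bias terms that this card does not state.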

Usage

from trainer.scratch_model import ScratchModelCreator

creator = ScratchModelCreator()
# Load from a local directory
model, tokenizer, config = creator.load_with_tokenizer("path/to/model")

# Or download from the Hugging Face Hub, then load from the returned local path
local_path = creator.download_from_hub("username/scratch-model-name")
model, tokenizer, config = creator.load_with_tokenizer(local_path)

Loading with Transformers

This model uses a GPT2-compatible configuration, but its weights must be loaded with the custom ScratchTransformer class. Load it through ScratchModelCreator, as shown under Usage.
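
For readers who want to inspect the hyperparameters with familiar names, the values under Model Configuration would map onto GPT-2 style config keys roughly as follows. The key names are the ones used by the Transformers `GPT2Config`; whether this repository's own config file uses exactly these keys, and whether dropout is applied at all three sites, are assumptions. The dict name `gpt2_style_config` is hypothetical.

```python
import json

# Hypothetical mapping of the card's values onto GPT-2 style config keys.
gpt2_style_config = {
    "vocab_size": 8000,    # Vocabulary Size
    "n_positions": 4096,   # Max Sequence Length
    "n_embd": 256,         # Hidden Size
    "n_layer": 16,         # Layers
    "n_head": 16,          # Attention Heads
    "n_inner": 512,        # FFN Dimension
    "resid_pdrop": 0.1,    # Dropout (assumed to apply to all three sites)
    "embd_pdrop": 0.1,
    "attn_pdrop": 0.1,
}

# The hidden size must divide evenly across the attention heads.
head_dim, remainder = divmod(gpt2_style_config["n_embd"],
                             gpt2_style_config["n_head"])
assert remainder == 0, "hidden size must be a multiple of the head count"

print(json.dumps(gpt2_style_config, indent=2))
print("per-head dimension:", head_dim)
```

This only describes the shape of the model. It does not load any weights; loading still goes through ScratchModelCreator.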

Created with Incremental Model Trainer
