Balanced Strategy - V8-BALANCED

Submitted by: MDaytek
Strategy: Optimal balance

Configuration

  • Vocab: 104 tokens
  • Embedding: 128
  • Layers: 5
  • Tying: True
  • Params: 1,037,696
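
The parameter count can be sanity-checked with a little arithmetic. The card does not state the context length, the feed-forward width, or whether biases are used, so the sketch below assumes a GPT-2-style decoder: learned positional embeddings over 256 positions, a feed-forward layer 4x the embedding width, biases on every linear layer, two LayerNorms per layer, and one final LayerNorm. The function `count_params` and its `context_len` and `ffn_mult` parameters are hypothetical names introduced for illustration. Under these assumptions the total matches the reported 1,037,696, which suggests, but does not confirm, that the model follows this layout.

```python
def count_params(vocab, d_model, n_layers, tied, context_len=256, ffn_mult=4):
    """Count parameters of a GPT-2-style decoder.

    context_len, ffn_mult, and the use of biases are assumptions;
    the model card does not state them.
    """
    token_emb = vocab * d_model                 # token embedding table
    pos_emb = context_len * d_model             # learned positional embeddings (assumed)

    # Attention: Q, K, V, and output projections, each d_model x d_model plus bias.
    attn = 4 * (d_model * d_model + d_model)

    # Feed-forward: up-projection and down-projection, each with bias.
    d_ff = ffn_mult * d_model
    ffn = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)

    # Two LayerNorms per layer, each with a scale and a shift vector.
    norms = 2 * (2 * d_model)

    per_layer = attn + ffn + norms
    final_norm = 2 * d_model

    # With weight tying the output head reuses the token embedding table.
    lm_head = 0 if tied else vocab * d_model

    return token_emb + pos_emb + n_layers * per_layer + final_norm + lm_head


total = count_params(vocab=104, d_model=128, n_layers=5, tied=True)
print(f"{total:,}")  # 1,037,696
```

Calling the same function with `tied=False` adds a separate 104 × 128 output projection, which shows how little weight tying saves at this vocabulary size compared with the per-layer cost.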

Why This Works

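This configuration aims to balance vocabulary size, embedding width, and depth. With 104 tokens and tied embeddings, the token embedding table takes only 104 × 128 = 13,312 parameters (about 1.3% of the total), leaving the rest of the 1,037,696-parameter budget for the 5 layers and the remaining components.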
