
Enhanced Hybrid Transformer 416M

🚀 A 416,417,792-parameter transformer with modern optimizations.

Features

  • 24 layers × 16 heads
  • GQA-4 (Grouped Query Attention)
  • SwiGLU activation
  • RMSNorm normalization
  • RoPE positional embeddings
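As a rough illustration of three of the features listed above, the following pure-Python sketch shows RMSNorm, the SwiGLU gate, and the query-to-KV head mapping behind grouped query attention. It is not taken from the model's repository: the function names and the `eps` default are hypothetical, and reading "GQA-4" as 4 key/value heads shared by the 16 query heads is an assumption (reading it as groups of 4 query heads happens to give the same mapping for 16 heads).

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """Scale x by the reciprocal of its root mean square, then by a learned weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]

def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))

def swiglu_gate(gate, up):
    """Elementwise SwiGLU gating: SiLU(gate) * up.

    The linear projections that produce `gate` and `up`, and the output
    projection that follows, are omitted here.
    """
    return [silu(g) * u for g, u in zip(gate, up)]

def kv_head_for_query_head(q_head, num_q_heads=16, num_kv_heads=4):
    """Return the KV head that a query head reads from (assumed 16 -> 4 grouping)."""
    group_size = num_q_heads // num_kv_heads
    return q_head // group_size
```

With these assumed defaults, query heads 0 to 3 share KV head 0, heads 4 to 7 share KV head 1, and so on, which is what shrinks the KV cache relative to full multi-head attention.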

Contents

  • pytorch_model.bin - Model weights
  • config.json - Model configuration
  • tokenizer.json - Tokenizer file
  • README.md - This file
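Before loading anything, it can help to confirm that a local copy contains the files listed above. The sketch below uses only the Python standard library; the helper names are hypothetical, and the keys inside config.json are defined by the original repository, so none are assumed here.

```python
import json
from pathlib import Path

# File names as listed in this README.
EXPECTED_FILES = ("pytorch_model.bin", "config.json", "tokenizer.json", "README.md")

def missing_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]

def load_config(model_dir):
    """Parse config.json into a plain dict without interpreting its keys."""
    with open(Path(model_dir) / "config.json", encoding="utf-8") as f:
        return json.load(f)
```

An empty list from `missing_files` means every listed file is present; `load_config` can then be used to inspect the configuration before choosing how to build the model.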

Usage

Load the weights with the model code from the original repository for full functionality.
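If the original repository code is not at hand, the checkpoint can still be inspected on its own. The following is a minimal sketch, assuming pytorch_model.bin is a flat state dict of tensors (this README does not confirm the layout); the helper names are hypothetical. It only loads and counts tensors and does not build the model.

```python
def count_parameters(state_dict):
    """Sum element counts over every tensor in a state dict."""
    return sum(t.numel() for t in state_dict.values())

def load_state_dict(path="pytorch_model.bin"):
    """Load the checkpoint onto the CPU without constructing the model."""
    import torch  # imported lazily so count_parameters works without PyTorch

    # weights_only=True restricts unpickling to tensors and basic containers.
    return torch.load(path, map_location="cpu", weights_only=True)

# Example usage (requires PyTorch and the downloaded checkpoint):
#   state = load_state_dict("pytorch_model.bin")
#   print(f"{count_parameters(state):,} parameters")
```

The printed total may not match 416,417,792 exactly if the checkpoint stores non-parameter buffers or lists tied weights under more than one key.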


🚀 Generated with Claude Code