# Enhanced Hybrid Transformer 416M
A **416,417,792-parameter** transformer with modern optimizations.
## Features
- **24 layers** × **16 heads**
- **GQA-4** (Grouped Query Attention)
- **SwiGLU** activation
- **RMSNorm** normalization
- **RoPE** positional embeddings
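
As a rough illustration of three of the components listed above, here is a minimal pure-Python sketch. It is not the repository's implementation: the function names, the `eps` value, and the reading of "GQA-4" as 4 key/value heads are assumptions. With 16 query heads, 4 key/value heads and groups of 4 query heads describe the same layout, so the sketch holds under either reading.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root mean square, then apply a learned per-feature weight."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)  # eps is an assumed default
    return [w * v / rms for v, w in zip(x, weight)]

def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))

def swiglu(gate, up):
    """SwiGLU gating: SiLU of the gate projection times the up projection, elementwise."""
    return [silu(g) * u for g, u in zip(gate, up)]

def kv_head_for(query_head, n_heads=16, n_kv_heads=4):
    """Grouped Query Attention: index of the key/value head shared by a given query head."""
    group_size = n_heads // n_kv_heads  # 4 query heads per key/value head (assumed)
    return query_head // group_size
```

Under these assumptions, query heads 0 to 3 share key/value head 0, heads 4 to 7 share head 1, and so on, which shrinks the key/value cache to a quarter of its full multi-head size.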
## Contents
- `pytorch_model.bin` - Model weights
- `config.json` - Model configuration
- `tokenizer.json` - Tokenizer file
- `README.md` - This file
## Usage
Load the weights with the model code from the original repository for full functionality.
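
Without the repository code, the files can still be inspected. The sketch below assumes that `pytorch_model.bin` is a standard PyTorch state dict and that the files sit in the current directory. The helper names `count_parameters` and `inspect_checkpoint` are hypothetical, and nothing here rebuilds the model.

```python
import json
from pathlib import Path

def count_parameters(state_dict):
    """Sum the element counts of every tensor in a state-dict-like mapping."""
    return sum(t.numel() for t in state_dict.values())

def inspect_checkpoint(model_dir="."):
    """Read config.json and count the parameters stored in pytorch_model.bin."""
    # Imported lazily so count_parameters works without PyTorch installed.
    import torch

    model_dir = Path(model_dir)
    config = json.loads((model_dir / "config.json").read_text())
    # Assumption: the checkpoint is a plain state dict of tensors.
    state_dict = torch.load(
        model_dir / "pytorch_model.bin", map_location="cpu", weights_only=True
    )
    return config, count_parameters(state_dict)

# Example usage (requires PyTorch and the files listed under Contents):
# config, n_params = inspect_checkpoint(".")
# print(f"{n_params:,} parameters")
```

The count may differ from the figure quoted at the top if the checkpoint stores tied weights under more than one key or includes non-parameter buffers.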
---
🤖 Generated with [Claude Code](https://claude.ai/code)