HeNLP/HeDC4
A Modern Hebrew specialized LLM based on the RWKVv6 Architecture
Trained only on Modern Hebrew datasets, with a custom vocabulary optimized for Modern Hebrew
Trained at Tel Aviv Makers Hackerspace
Layers 12
Depth 512
Head size 64
Train ctx_len 512
Train tokens 6,841,411,389 (about 6.8 billion)
Vocab size 65536
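A few quantities follow directly from the values listed above; the short sketch below works them out. It assumes that "Depth" refers to the embedding width, which this card does not state, so the head count and embedding-table size hold only under that reading. The variable names are illustrative and are not taken from the training code.

```python
# Values copied from the hyperparameter list above.
n_layer = 12
n_embd = 512          # assumption: "Depth" is the embedding width
head_size = 64
ctx_len = 512
train_tokens = 6_841_411_389
vocab_size = 65_536

# Number of heads, if the embedding width is split into heads of size 64.
n_heads = n_embd // head_size

# Size of one vocab_size x n_embd embedding table (same assumption as above).
embedding_params = vocab_size * n_embd

# Number of full, non-overlapping 512-token windows in the training tokens.
full_windows = train_tokens // ctx_len

print(f"heads: {n_heads}")
print(f"embedding table parameters: {embedding_params:,}")
print(f"full {ctx_len}-token windows: {full_windows:,}")
```

The window count is only a rough measure of how many training samples the token budget corresponds to; the actual sampling scheme is not described here.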
All compute was performed on a single Nvidia P40 card
Experiments: 62 hours 52 minutes (2.6 days)
Training run: 208 hours 10 minutes (8.7 days)
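As a rough sanity check, the durations and token count above imply an average training throughput. The sketch below derives it, assuming the full token count was processed during the training run alone and that the run was continuous; neither is stated explicitly, so treat the result as an estimate.

```python
from datetime import timedelta

train_tokens = 6_841_411_389
experiments = timedelta(hours=62, minutes=52)
training_run = timedelta(hours=208, minutes=10)


def to_days(duration: timedelta) -> float:
    """Convert a duration to fractional days."""
    return duration.total_seconds() / 86_400


# Average throughput over the training run (assumes one continuous run).
tokens_per_second = train_tokens / training_run.total_seconds()

# Total time the card was in use, experiments plus training.
total_compute = experiments + training_run

print(f"experiments:  {to_days(experiments):.1f} days")
print(f"training run: {to_days(training_run):.1f} days")
print(f"total:        {to_days(total_compute):.1f} days")
print(f"throughput:   {tokens_per_second:,.0f} tokens/s")
```

Under these assumptions the throughput comes out at roughly nine thousand tokens per second on the single P40.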