Tritter 100M BitNet
A 100M-parameter BitNet b1.58 model with ternary-quantized weights, trained for code generation.
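For readers unfamiliar with the term, "ternary-quantized" means each weight is restricted to one of three values: -1, 0, or +1. The sketch below illustrates the absmean scheme described in the BitNet b1.58 paper, written in plain Python. It is an assumption that Tritter follows this scheme exactly; the function name `absmean_ternary`, the `eps` value, and the sample weights are illustrative and not taken from the Tritter code base.

```python
def absmean_ternary(weights, eps=1e-5):
    """Quantize a flat list of floats to {-1, 0, +1} plus one shared scale.

    Illustrative sketch of absmean quantization (BitNet b1.58 paper);
    not the Tritter implementation.
    """
    # The scale is the mean absolute weight; eps guards against division by zero.
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    # Divide by the scale, round to the nearest integer, then clip to [-1, 1].
    ternary = [max(-1, min(1, round(w / scale))) for w in weights]
    return ternary, scale


# Hypothetical sample weights, chosen only to show all three output values.
weights = [0.9, -0.05, 0.4, -1.3, 0.02]
ternary, scale = absmean_ternary(weights)
print(ternary)  # [1, 0, 1, -1, 0]
```

Small weights collapse to 0 and large ones saturate at -1 or +1, so each weight carries about log2(3) ≈ 1.58 bits, which is where the "b1.58" in the name comes from.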
Training Details
- Parameters: 100,680,704
- Training tokens: 9,782,048
- Final loss: 11.2704
- Min loss: 11.0312
- Tokens/sec: 56136.5
- Training duration: 0:02:54.254825 (h:mm:ss)
- GPU: NVIDIA GeForce RTX 5080
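As a quick sanity check, the reported throughput can be compared against the token count and duration in the list above. The snippet below is only arithmetic on those figures; the variable names are illustrative, and it assumes the duration is wall-clock time covering the whole run.

```python
from datetime import timedelta

# Figures copied from the Training Details list.
training_tokens = 9_782_048
reported_tokens_per_sec = 56136.5
# 0:02:54.254825 read as hours:minutes:seconds.microseconds.
duration = timedelta(hours=0, minutes=2, seconds=54, microseconds=254825)

# Average throughput implied by total tokens over total duration.
tokens_per_sec = training_tokens / duration.total_seconds()
print(f"implied throughput: {tokens_per_sec:.1f} tokens/sec")
print(f"difference from reported: {abs(tokens_per_sec - reported_tokens_per_sec):.3f}")
```

The implied average agrees with the reported 56136.5 tokens/sec to well under one token per second, so the three figures are mutually consistent.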
Checkpoints
Intermediate checkpoints available at 10%, 20%, ..., 90% progress.
Generated with Tritter