Tritter 500M BitNet
A 500M parameter BitNet b1.58 ternary-quantized model trained for code generation.
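"Ternary" here means each weight takes one of three values, -1, 0, or +1, which carries log2(3) ≈ 1.58 bits of information and gives b1.58 its name. This card does not include the quantization code, so the following is only a minimal sketch of the absmean scheme usually associated with BitNet b1.58, not Tritter's actual implementation. The function name `absmean_ternary_quantize`, the `eps` value, and the sample weights are assumptions made for illustration.

```python
def absmean_ternary_quantize(weights, eps=1e-5):
    """Map a flat list of floats to ternary values {-1, 0, +1}.

    Hypothetical helper: scale by the mean absolute weight (gamma),
    round to the nearest integer, then clip to [-1, 1].
    """
    if not weights:
        return [], 0.0
    gamma = sum(abs(w) for w in weights) / len(weights)
    scale = gamma + eps  # eps guards against division by zero
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, gamma


# Illustrative weights only; they are not taken from the model.
weights = [0.42, -0.05, -0.9, 0.1, 0.0, 1.3]
q, gamma = absmean_ternary_quantize(weights)
print(q)  # [1, 0, -1, 0, 0, 1]
```

Small weights collapse to 0 and large ones saturate at ±1, so `gamma` would need to be kept alongside the ternary values if the original magnitudes are to be approximated later.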
Training Details
- Parameters: 524,372,480
- Training tokens: 118,111,072
- Final loss: 11.2343
- Min loss: 11.0722
- Tokens/sec: 23,679.4
- Training duration: 1:23:07.915359 (h:mm:ss)
- GPU: NVIDIA GeForce RTX 5080
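The figures above can be cross-checked against each other: dividing the training tokens by the throughput should roughly reproduce the training duration. The sketch below does that arithmetic using only the numbers listed in this card. The variable names are illustrative, and it assumes the reported throughput is an average over the whole run.

```python
from datetime import timedelta

# Values copied from the Training Details list.
parameters = 524_372_480
training_tokens = 118_111_072
tokens_per_sec = 23679.4
reported_duration = timedelta(hours=1, minutes=23, seconds=7, microseconds=915359)

# Duration implied by tokens / throughput.
estimated_seconds = training_tokens / tokens_per_sec
difference = abs(estimated_seconds - reported_duration.total_seconds())

# Ratio of training tokens to parameters.
tokens_per_param = training_tokens / parameters

print(f"estimated duration: {timedelta(seconds=round(estimated_seconds))}")
print(f"difference from reported: {difference:.3f} s")
print(f"tokens per parameter: {tokens_per_param:.3f}")
```

The estimated and reported durations agree to within a second, so the three figures are mutually consistent. The run covers roughly 0.23 training tokens per parameter.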
Checkpoints
Intermediate checkpoints are available at every 10% of training progress, from 10% through 90%.
Generated with Tritter