Tritter 500M BitNet

A 500M-parameter BitNet b1.58 model with ternary-quantized weights, trained for code generation.
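For readers unfamiliar with the term, "ternary-quantized" means each weight is restricted to one of three values, -1, 0, or +1, which carries log2(3) ≈ 1.58 bits of information (hence "b1.58"). The sketch below illustrates the absmean scheme described in the BitNet b1.58 paper on a toy list of weights. It is an illustration under that assumption only: this card does not state how Tritter implements quantization, and the function name `absmean_ternary` and the sample weights are hypothetical.

```python
def absmean_ternary(weights, eps=1e-5):
    """Quantize a flat list of floats to {-1, 0, +1} using a single scale.

    Assumed scheme (BitNet b1.58 absmean): divide by the mean absolute
    value, round to the nearest integer, then clip to [-1, 1].
    """
    scale = sum(abs(w) for w in weights) / len(weights)  # mean |w|
    quantized = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return quantized, scale


# Hypothetical toy weights, not taken from the model.
weights = [0.9, -0.05, 0.4, -1.3, 0.02, -0.6]
q, s = absmean_ternary(weights)
print(q, s)
```

Small weights collapse to 0 and large ones saturate at ±1; the scale is kept so that outputs can be rescaled after the ternary matrix multiply.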

Training Details

  • Parameters: 524,372,480
  • Training tokens: 118,111,072
  • Final loss: 11.2343
  • Min loss: 11.0722
  • Tokens/sec: 23,679.4
  • Training duration: 1:23:07.915359 (h:mm:ss)
  • GPU: NVIDIA GeForce RTX 5080
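The figures above can be cross-checked against each other. The short script below is a sanity check rather than part of the training code: it multiplies the reported throughput by the reported duration and compares the result with the reported token count, then computes tokens seen per parameter. It assumes the duration is in hours:minutes:seconds and that the throughput is an average over the whole run; the variable names are illustrative.

```python
from datetime import timedelta

# Values copied from the Training Details list.
parameters = 524_372_480
training_tokens = 118_111_072
tokens_per_sec = 23679.4
duration = timedelta(hours=1, minutes=23, seconds=7.915359)

elapsed = duration.total_seconds()

# Tokens implied by average throughput times wall-clock time.
implied_tokens = tokens_per_sec * elapsed
relative_gap = abs(implied_tokens - training_tokens) / training_tokens

# Training tokens seen per model parameter.
tokens_per_param = training_tokens / parameters

print(f"elapsed seconds:  {elapsed:.3f}")
print(f"implied tokens:   {implied_tokens:,.0f}")
print(f"relative gap:     {relative_gap:.2e}")
print(f"tokens/parameter: {tokens_per_param:.3f}")
```

Under these assumptions the implied and reported token counts agree to well within 0.01%, and the run covers roughly 0.23 tokens per parameter.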

Checkpoints

Intermediate checkpoints available at 10%, 20%, ..., 90% progress.
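As a rough guide to what each checkpoint has seen, the snippet below maps each progress mark to an approximate token count. It assumes progress is measured as a fraction of the total training tokens, which this card does not confirm; if progress is counted in optimizer steps or wall-clock time, the numbers will differ.

```python
training_tokens = 118_111_072  # from Training Details

# Approximate tokens seen at each 10% mark (assumes token-based progress).
checkpoints = {pct: training_tokens * pct // 100 for pct in range(10, 100, 10)}

for pct, tokens in checkpoints.items():
    print(f"{pct:>3}%  ~{tokens:,} tokens")
```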

Generated with Tritter
