Tritter 100M BitNet

A 100M-parameter BitNet b1.58 model with ternary-quantized weights, trained for code generation.
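For readers unfamiliar with ternary quantization, the sketch below illustrates the absmean scheme described in the BitNet b1.58 paper: weights are divided by their mean absolute value, rounded, and clipped to {-1, 0, +1}. This card does not say how Tritter implements quantization, so treat this as an illustration under that assumption. The function name `absmean_ternary` and the sample weights are hypothetical.

```python
def absmean_ternary(weights, eps=1e-8):
    """Quantize a flat list of floats to {-1, 0, +1} using an absmean scale.

    Assumption: this mirrors the scheme from the BitNet b1.58 paper,
    not necessarily the code used to train this model.
    """
    # Scale is the mean absolute value of the weights.
    scale = sum(abs(w) for w in weights) / len(weights)
    # Divide by the scale, round to the nearest integer, clip to [-1, 1].
    quantized = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return quantized, scale


# Hypothetical example weights, chosen only for illustration.
weights = [0.9, -0.05, 0.4, -1.2, 0.0, 0.45]
q, scale = absmean_ternary(weights)
print(q)      # every entry is -1, 0, or +1
print(scale)  # approximately 0.5 for these weights
```

Each quantized weight takes one of three values, which is where the "1.58" comes from: log2(3) is roughly 1.58 bits per weight.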

Training Details

  • Parameters: 100,680,704
  • Training tokens: 9,782,048
  • Final loss: 11.2704
  • Min loss: 11.0312
  • Tokens/sec: 56,136.5
  • Training duration: 0:02:54.254825 (h:mm:ss, about 2 minutes 54 seconds)
  • GPU: NVIDIA GeForce RTX 5080
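As a quick consistency check on the figures above, the following sketch recomputes throughput from the token count and duration, and derives the tokens-per-parameter ratio. All inputs are copied from the list; the variable names are illustrative only.

```python
from datetime import timedelta

# Values copied from the training details above.
parameters = 100_680_704
training_tokens = 9_782_048
duration = timedelta(minutes=2, seconds=54, microseconds=254825)

# Throughput implied by token count and wall-clock duration.
tokens_per_sec = training_tokens / duration.total_seconds()

# Training tokens seen per model parameter.
tokens_per_param = training_tokens / parameters

print(f"tokens/sec: {tokens_per_sec:.1f}")
print(f"tokens per parameter: {tokens_per_param:.4f}")
```

The recomputed throughput agrees with the reported tokens/sec to within rounding, and the run covers slightly less than 0.1 training tokens per parameter.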

Checkpoints

Intermediate checkpoints are available at every 10% of training progress, from 10% through 90%.

Generated with Tritter
