license: cc-by-nc-sa-4.0
datasets:
  - GrainWare/tuxsentience-v1
language:
  - en
base_model:
  - unsloth/Qwen3-8B-GGUF

tuxsentience-beta3

Our second open-weight model, currently in progress. For now, this page documents progress and details.

Model Information

This model will be based on Qwen3 8B.

Like the last one, it will most likely be 4-bit, but thanks to our new training methods (detailed below) we may release larger sizes.

Training Information

We are attempting to train this model via distributed computing. Our current setup looks like this:

i9-10910, 32GB RAM, RX 7600 (8GB)
i5-13420H, 16GB RAM, RTX 3050 Mobile (6GB)
i5-12400, 32GB RAM, RTX 3060 (12GB)
Ryzen 7 9800X3D, 32GB RAM, RTX 3080 (10GB)

Together, this amounts to around 98.47 TFLOPS.
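As a rough sanity check on the machines listed above, the sketch below totals their RAM and VRAM and estimates which GPUs could hold the weights of an 8B model at a given bit width. The nominal parameter count of 8e9 and the names `Node`, `weight_gb`, and `nodes_that_fit` are assumptions made for illustration. The estimate counts weights only and ignores activations, gradients, optimizer state, and framework overhead, so real training needs are considerably higher.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    cpu: str
    ram_gb: int
    gpu: str
    vram_gb: int


# Hardware as listed in this section.
NODES = [
    Node("i9-10910", 32, "RX 7600", 8),
    Node("i5-13420H", 16, "RTX 3050 Mobile", 6),
    Node("i5-12400", 32, "RTX 3060", 12),
    Node("Ryzen 7 9800X3D", 32, "RTX 3080", 10),
]

# Assumption: nominal parameter count for an "8B" model.
PARAMS = 8e9


def weight_gb(bits_per_param: int, params: float = PARAMS) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return params * bits_per_param / 8 / 1e9


def nodes_that_fit(bits_per_param: int) -> list[str]:
    """GPUs whose VRAM strictly exceeds the weight footprint (no overhead counted)."""
    need = weight_gb(bits_per_param)
    return [n.gpu for n in NODES if n.vram_gb > need]


total_vram_gb = sum(n.vram_gb for n in NODES)
total_ram_gb = sum(n.ram_gb for n in NODES)

if __name__ == "__main__":
    print(f"Total VRAM: {total_vram_gb} GB, total RAM: {total_ram_gb} GB")
    for bits in (4, 8, 16):
        print(f"{bits}-bit weights: ~{weight_gb(bits):.0f} GB, fits on: {nodes_that_fit(bits)}")
```

Under these assumptions, 4-bit weights fit on every GPU in the list, while 16-bit weights fit on none of them individually, which is one reason splitting work across machines is attractive.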

We are trying to acquire better hardware, and an RX 9070 XT is planned for future models. Currently we are attempting to use Unsloth + Ray for distributed computing.

Benchmarks

Coming soon to an accuracy near you

FAQ

Q: This implies the existence of beta1 and alpha versions.
A: They do exist; however, they were never published and most likely never will be.

Made possible by

https://accuratelinuxgraphs.com/ - Benchmarks and data visualization