| | --- |
| | license: cc-by-nc-sa-4.0 |
| | datasets: |
| | - GrainWare/tuxsentience-v1 |
| | language: |
| | - en |
| | base_model: |
| | - unsloth/Qwen3-8B-GGUF |
| | --- |
| | |
| | # tuxsentience-beta3 |
| | Our second open-weight model, currently in progress. For now, this card documents progress and details. |
| |
|
| | #### Model Information |
| | This model will be based on Qwen3 8B. |
| |
|
| | Like the previous model, it will most likely be released as a 4-bit quantization, but our new training methods (detailed below) may allow us to release larger sizes. |
| |
|
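As a rough guide to what "4-bit" versus "larger sizes" means in practice, the following sketch estimates the size of the weights alone at different bit widths. It assumes a nominal 8 billion parameters (taking "8B" at face value) and ignores quantization metadata, the KV cache, and activations, so real files and real memory use will be somewhat larger. The function name `weight_footprint_gib` is made up for this example.

```python
def weight_footprint_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# Assumption: "8B" taken at face value; the exact parameter count differs slightly.
NOMINAL_PARAMS = 8e9

for bits in (4, 8, 16):
    size = weight_footprint_gib(NOMINAL_PARAMS, bits)
    print(f"{bits}-bit: {size:.2f} GiB")
```

Under these assumptions the 4-bit weights come to roughly 3.7 GiB, and each doubling of the bit width doubles that figure.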
| | #### Training Information |
| | We are attempting to train this model via distributed computing. Our current setup looks like this: |
| | - i9-10910, 32GB RAM, RX 7600 (8GB) |
| | - i5-13420H, 16GB RAM, RTX 3050 Mobile (6GB) |
| | - i5-12400, 32GB RAM, RTX 3060 (12GB) |
| | - Ryzen 7 9800X3D, 32GB RAM, RTX 3080 (10GB) |
| |
|
| | Combined, this amounts to around 98.47 TFLOPS. |
| |  |
| |
|
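The hardware list above can also be written down as a small inventory, which makes it easy to see the combined VRAM and which GPU is the tightest constraint. This is only an illustrative sketch: the `Node` class and helper names are hypothetical, and the model card does not say which parallelism scheme is used. If every worker has to hold a full copy of the model, the smallest GPU is the one that bounds what fits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    cpu: str
    ram_gb: int
    gpu: str
    vram_gb: int


# Values copied from the setup list in this model card.
NODES = [
    Node("i9-10910", 32, "RX 7600", 8),
    Node("i5-13420H", 16, "RTX 3050 Mobile", 6),
    Node("i5-12400", 32, "RTX 3060", 12),
    Node("Ryzen 7 9800X3D", 32, "RTX 3080", 10),
]


def total_vram_gb(nodes):
    """Sum of GPU memory across all nodes."""
    return sum(n.vram_gb for n in nodes)


def smallest_gpu(nodes):
    """Node whose GPU has the least memory."""
    return min(nodes, key=lambda n: n.vram_gb)


print(f"Total VRAM: {total_vram_gb(NODES)} GB")
print(f"Smallest GPU: {smallest_gpu(NODES).gpu} ({smallest_gpu(NODES).vram_gb} GB)")
```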
| | We are trying to acquire better hardware, and an RX 9070 XT is planned for future models. Currently, we are attempting to use Unsloth + Ray for distributed computing. |
| |
|
| | # Benchmarks |
| | > [!IMPORTANT] |
| | > Coming soon to an accuracy near you |
| |
|
| | # FAQ |
| | - Q: **This implies the existence of beta1 and alpha versions** |
| | - A: They do exist; however, they were never published and most likely never will be. |
| |
|
| | # Made possible by |
| | - https://accuratelinuxgraphs.com/ - Benchmarks and data visualization |