
	---
	license: cc-by-nc-sa-4.0
	datasets:
	- GrainWare/tuxsentience-v1
	language:
	- en
	base_model:
	- unsloth/Qwen3-8B-GGUF
	---

	# tuxsentience-beta3
	Our second open-weight model, currently in progress. For now, this card documents progress and details.

	#### Model Information
	This model will be based on Qwen3 8B.

	Like the previous model, it will most likely be 4-bit, but thanks to our new training methods (detailed below) we may release larger sizes.
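As a rough sanity check on what "4-bit" buys, the sketch below estimates the size of the weights alone at a few bit widths. It assumes a nominal 8 billion parameters (the real count for an "8B" model differs somewhat) and ignores quantization metadata, activations, optimizer state, and the KV cache, so treat the numbers as lower bounds rather than real memory requirements. The function name is made up for illustration.

```python
def weight_footprint_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: "8B" is taken as exactly 8 billion parameters.
NOMINAL_PARAMS = 8e9

for bits in (4, 8, 16):
    size = weight_footprint_gb(NOMINAL_PARAMS, bits)
    print(f"{bits}-bit weights: ~{size:.1f} GB")
```

Under these assumptions, 4-bit weights come to about 4 GB, while 16-bit weights come to about 16 GB.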

	#### Training Information
	We are attempting to train this model via distributed computing. This is how our current setup looks so far:
	- i9-10910, 32GB RAM, RX 7600 (8GB)
	- i5-13420H, 16GB RAM, RTX 3050 Mobile (6GB)
	- i5-12400, 32GB RAM, RTX 3060 (12GB)
	- Ryzen 7 9800X3D, 32GB RAM, RTX 3080 (10GB)

	Together, these amount to around 98.47 TFLOPS.
	![image/png](https://cdn-uploads.huggingface.co/production/uploads/6892e1d075d5f81b666d5938/__b3LqmWdLvv2ckMgAC9L.png)

	We are trying to acquire better hardware, and an RX 9070 XT is planned for future models. Currently we are attempting to use Unsloth + Ray for distributed computing.
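One practical wrinkle with a mixed setup like this is that the GPUs have different amounts of memory, so an even split of work may not fit everywhere. The sketch below is one simple way to think about it: divide a global batch in proportion to each GPU's VRAM. This is only an illustration in plain Python, not our actual training code and not part of the Unsloth or Ray APIs. The VRAM figures come from the hardware list above; the node labels and the `split_batch` helper are hypothetical, and VRAM is a crude proxy that ignores compute speed and network bandwidth.

```python
# VRAM in GB, from the hardware list; the keys are hypothetical labels.
NODES = {"rx7600": 8, "rtx3050m": 6, "rtx3060": 12, "rtx3080": 10}


def split_batch(global_batch: int, vram_gb: dict[str, int]) -> dict[str, int]:
    """Split a global batch across nodes in proportion to VRAM."""
    total = sum(vram_gb.values())
    # Floor of each node's proportional share.
    shares = {name: global_batch * gb // total for name, gb in vram_gb.items()}
    # Flooring can leave a few samples unassigned; give one each to the
    # nodes with the most VRAM until the batch is fully distributed.
    leftover = global_batch - sum(shares.values())
    for name in sorted(vram_gb, key=vram_gb.get, reverse=True)[:leftover]:
        shares[name] += 1
    return shares


print(split_batch(32, NODES))
```

With a global batch of 32, the two largest cards pick up the two samples left over after rounding down, and the per-node shares always sum back to the global batch.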

	# Benchmarks
	> [!IMPORTANT]
	> Coming soon to an accuracy near you

	# FAQ
	- Q: This implies the existence of beta1 and alpha versions
	- A: They do exist; however, they were never published and most likely never will be

	# Made possible by
	- https://accuratelinuxgraphs.com/ - Benchmarks and data visualization