Update model card: simplify training data description
README.md
CHANGED
@@ -20,7 +20,7 @@ model-index:
 
 # qwen3-8b-rails
 
-An 8B parameter dense model fine-tuned for **Ruby on Rails code generation**. Trained on 111,000 samples extracted from
+An 8B parameter dense model fine-tuned for **Ruby on Rails code generation**. Trained on 111,000 samples extracted from our own internal Rails projects. Small enough to run on a laptop.
 
 Built by [Bytecode](https://bytecode.hr).
 
@@ -31,7 +31,7 @@ Built by [Bytecode](https://bytecode.hr).
 | Base model | [Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) |
 | Architecture | Qwen3 dense (8B parameters) |
 | Training method | QLoRA (rank 16) via [Unsloth](https://github.com/unslothai/unsloth) |
-| Training data | 111K samples from
+| Training data | 111K samples from internal Rails projects |
 | Training cost | ~$21 (A100 80GB, ~17 hours) |
 | Quantization | GGUF Q4_K_M (5.03 GB) |
 
@@ -70,7 +70,7 @@ Fits comfortably on any modern laptop. GGUF file size + 2–3 GB for KV cache.
 Trained with LoRA (rank 16, alpha 16) on attention projection layers. Only 0.78% of parameters were trained. The full training run took ~17 hours on a single A100 80GB GPU.
 
 The dataset:
 
-1.
+1. Our internal Rails projects
 2. 15-step cleaning and deduplication pipeline
 3. 111K final training samples with contrastive pairs
 4. Source diversity cap at 20% per repository