---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- ruby
- rails
- code-generation
- gguf
- fine-tuned
- lora
- unsloth
pipeline_tag: text-generation
base_model: Qwen/Qwen3-8B
model-index:
- name: qwen3-8b-rails
  results: []
---
# qwen3-8b-rails
An 8B-parameter dense model fine-tuned for **Ruby on Rails code generation**, trained on 111,000 samples extracted from our internal Rails projects and small enough to run on a laptop.
Built by [Bytecode](https://bytecode.hr).
## Model Details
| Property | Value |
|---|---|
| Base model | [Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) |
| Architecture | Qwen3 dense (8B parameters) |
| Training method | QLoRA (rank 16) via [Unsloth](https://github.com/unslothai/unsloth) |
| Training data | 111K samples from internal Rails projects |
| Training cost | ~$21 (A100 80GB, ~17 hours) |
| Quantization | GGUF Q4_K_M (5.03 GB) |
## What it does
This model writes idiomatic Ruby on Rails code following specific conventions:
- Devise authentication
- Namespaced concerns instead of service objects
- Sidekiq instead of Solid Queue
- State-as-records instead of boolean flags
- DaisyUI drawer layouts instead of ActiveAdmin
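
To make one of these conventions concrete, here is a minimal plain-Ruby sketch of one possible reading of "state-as-records instead of boolean flags". It is an illustration under assumptions, not code taken from the model or its training data: the `Subscription` and `Cancellation` classes are hypothetical, and in a real Rails app the record would presumably be an Active Record model behind a `has_one` association.

```ruby
# Plain-Ruby sketch, no Rails required. All class names are hypothetical.

# The state lives in its own record, so it can carry when and why it happened.
Cancellation = Struct.new(:cancelled_at, :reason)

class Subscription
  attr_reader :plan, :cancellation

  def initialize(plan)
    @plan = plan
    @cancellation = nil # no record means the subscription is still active
  end

  # Create the state record instead of flipping a `cancelled` boolean.
  def cancel!(reason:, at: Time.now)
    @cancellation = Cancellation.new(at, reason)
  end

  # The boolean question is answered by the presence of the record.
  def cancelled?
    !@cancellation.nil?
  end
end
```

Compared with a bare `cancelled` flag, the record keeps the timestamp and reason next to the state change itself.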
The 8B model is the lightweight option — fast enough for inline code completion, small enough to run alongside your development server without swapping.
## Usage with Ollama
```bash
# Download and run
ollama run bytecodehr/qwen3-8b-rails
# Example prompt
ollama run bytecodehr/qwen3-8b-rails "Write a Rails migration for a subscriptions table with plan, status, and billing cycle"
```
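
Once the model has been pulled, it can also be called from Ruby through Ollama's local HTTP API. The sketch below assumes an Ollama server running on its default address (`localhost:11434`). The helper names `build_payload` and `generate` are hypothetical, and the 300-second read timeout is an arbitrary choice.

```ruby
require "json"
require "net/http"
require "uri"

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = URI("http://localhost:11434/api/generate")
MODEL = "bytecodehr/qwen3-8b-rails"

# Build the request body for a single, non-streaming completion.
def build_payload(prompt, model: MODEL)
  { model: model, prompt: prompt, stream: false }
end

# Send the prompt to the local Ollama server and return the generated text.
def generate(prompt)
  request = Net::HTTP::Post.new(OLLAMA_URL, "Content-Type" => "application/json")
  request.body = JSON.generate(build_payload(prompt))

  response = Net::HTTP.start(OLLAMA_URL.host, OLLAMA_URL.port, read_timeout: 300) do |http|
    http.request(request)
  end
  raise "Ollama returned HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  JSON.parse(response.body).fetch("response")
end
```

Calling `puts generate("Write a Rails migration for a subscriptions table with plan, status, and billing cycle")` would print the completion for the same prompt as the CLI example.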
### Memory requirements
| Format | GGUF Size | Min RAM | Recommended |
|---|---|---|---|
| Q4_K_M | 5.03 GB | 8 GB | 16 GB |
Fits comfortably on most modern laptops. Estimated memory use is the GGUF file size plus 2–3 GB for the KV cache, or roughly 7–8 GB in total.
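
The rule of thumb above can be written out as a small calculation. This is only a sketch of the arithmetic stated in this section; real usage will vary with context length and runtime settings, and the method name is hypothetical.

```ruby
GGUF_SIZE_GB = 5.03      # Q4_K_M file size from the table above
KV_CACHE_GB = 2.0..3.0   # KV cache allowance quoted in this section

# Return the [low, high] memory estimate in GB: file size plus KV cache.
def estimated_memory_gb(gguf_gb: GGUF_SIZE_GB, kv_cache_gb: KV_CACHE_GB)
  [(gguf_gb + kv_cache_gb.begin).round(2), (gguf_gb + kv_cache_gb.end).round(2)]
end

low, high = estimated_memory_gb
puts format("Estimated memory: %.2f to %.2f GB", low, high)
```

The upper end of the estimate sits right at the 8 GB minimum, which is consistent with 16 GB being the recommended configuration.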
## Training
Trained with QLoRA (LoRA rank 16, alpha 16) on the attention projection layers. Only 0.78% of parameters were trained. The full training run took ~17 hours on a single A100 80GB GPU.
The dataset:
1. Our internal Rails projects
2. 15-step cleaning and deduplication pipeline
3. 111K final training samples with contrastive pairs
4. Source diversity cap at 20% per repository
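
As an illustration of the last item, here is one way a per-repository cap could be enforced. This is a sketch under assumptions, not the pipeline described in the blog posts: the sample shape (a hash with a `:repo` key) and the `cap_by_source` helper are hypothetical, and the cap is interpreted as "no repository exceeds 20% of the final dataset".

```ruby
# Cap each repository's share of the final dataset.
# `samples` is an array of hashes with a :repo key (hypothetical shape).
def cap_by_source(samples, max_percent: 20)
  groups = samples.group_by { |sample| sample[:repo] }
  limit = groups.values.map(&:size).max || 0

  # Lower the per-repository limit until the largest capped repository
  # makes up at most max_percent of the capped total.
  while limit.positive?
    total = groups.values.sum { |group| [group.size, limit].min }
    break if limit * 100 <= max_percent * total

    limit -= 1
  end

  groups.values.flat_map { |group| group.first(limit) }
end

# Toy data: one dominant repository and five smaller ones.
counts = { "app_a" => 10, "app_b" => 5, "app_c" => 5, "app_d" => 5, "app_e" => 5, "app_f" => 5 }
samples = counts.flat_map do |repo, count|
  Array.new(count) { |i| { repo: repo, id: "#{repo}-#{i}" } }
end

capped = cap_by_source(samples)
p capped.group_by { |sample| sample[:repo] }.transform_values(&:size)
```

With a 20% cap, at least five repositories are needed for any samples to survive; with fewer, this sketch returns an empty array rather than silently violating the cap.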
Full details in our blog posts:
- [Part 1: Dataset Engineering](https://bytecode.hr/posts/training-rails-llms-part-1-dataset-engineering)
- [Part 2: Training, Quantization, and Deployment](https://bytecode.hr/posts/training-rails-llms-part-2-training-quantization-deployment)
## Why Ruby for LLMs?
Ruby uses 42–45% fewer tokens than TypeScript across every major LLM tokenizer. Fewer tokens means more code in the context window, faster generations, and lower costs. Read our analysis: [Why Ruby Is the Better Language for LLM-Powered Development](https://bytecode.hr/posts/why-ruby-is-the-better-language-for-llm-powered-development).
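
To put the token savings in context-window terms, the short calculation below converts "N% fewer tokens" into how much more code fits in a fixed window. It assumes the savings apply uniformly across a codebase, which is a simplification. The method name is hypothetical.

```ruby
# If the same program needs `percent_fewer` percent fewer tokens, a fixed
# context window holds 100 / (100 - percent_fewer) times as much of it.
def context_capacity_multiplier(percent_fewer)
  100.0 / (100 - percent_fewer)
end

[42, 45].each do |percent|
  puts format("%d%% fewer tokens: about %.2fx as much code per window",
              percent, context_capacity_multiplier(percent))
end
```

Under that assumption, the quoted 42–45% range corresponds to roughly 1.7 to 1.8 times as much code in the same window.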
## Other models
- [bytecodehr/qwen3-coder-30b-rails](https://huggingface.co/bytecodehr/qwen3-coder-30b-rails): 31B MoE flagship model (18–21 GB)
- [bytecodehr/qwen2.5-coder-7b-rails](https://huggingface.co/bytecodehr/qwen2.5-coder-7b-rails): 7B LoRA adapter
- [bytecodehr/qwen2.5-coder-3b-rails](https://huggingface.co/bytecodehr/qwen2.5-coder-3b-rails): 3B LoRA adapter