---
language:
  - en
license: apache-2.0
library_name: transformers
tags:
  - ruby
  - rails
  - code-generation
  - gguf
  - fine-tuned
  - lora
  - unsloth
pipeline_tag: text-generation
base_model: Qwen/Qwen3-Coder-30B-A3B-Instruct
model-index:
  - name: qwen3-coder-30b-rails
    results: []
---

# qwen3-coder-30b-rails

A 31B-parameter Mixture-of-Experts model fine-tuned for **Ruby on Rails code generation**, trained on 111,000 samples extracted from our own internal Rails projects.

Built by [Bytecode](https://bytecode.hr).

## Model Details

| Property | Value |
|---|---|
| Base model | [Qwen3-Coder-30B-A3B-Instruct](https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct) |
| Architecture | Qwen3 MoE (31B total, 3B active) |
| Training method | QLoRA (rank 16) via [Unsloth](https://github.com/unslothai/unsloth) |
| Training data | 111K samples from internal Rails projects |
| Training cost | ~$32 (A100 80GB, ~26 hours) |
| Quantization | GGUF Q4_K_M (18.6 GB), Q5_K_M (21.7 GB) |

## What it does

This model writes idiomatic Ruby on Rails code following specific conventions:

- Devise authentication
- Namespaced concerns instead of service objects
- Sidekiq instead of Solid Queue
- State-as-records instead of boolean flags
- DaisyUI drawer layouts instead of ActiveAdmin

It generates code that follows these patterns without prompt engineering, because the conventions are baked into the weights.

## Usage with Ollama

```bash
# Download and run
ollama run bytecodehr/qwen3-coder-30b-rails

# Example prompt
ollama run bytecodehr/qwen3-coder-30b-rails "Write a Rails controller for managing user subscriptions with state transitions"
```
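
Ollama also serves a local HTTP API, so the model can be called from a Ruby script instead of the CLI. The following is a minimal sketch, assuming an Ollama server on its default address (`localhost:11434`) and a single non-streamed call to its `/api/generate` endpoint. The helper names `build_generate_request` and `generate` are hypothetical and only serve this example.

```ruby
require "json"
require "net/http"
require "uri"

# Assumption: a local Ollama server listening on its default port.
OLLAMA_URL = URI("http://localhost:11434/api/generate")
MODEL = "bytecodehr/qwen3-coder-30b-rails"

# Builds the POST request for one non-streamed completion (hypothetical helper).
def build_generate_request(prompt, model: MODEL, uri: OLLAMA_URL)
  request = Net::HTTP::Post.new(uri, "Content-Type" => "application/json")
  request.body = JSON.generate(model: model, prompt: prompt, stream: false)
  request
end

# Sends the request and returns the generated text (hypothetical helper).
def generate(prompt, uri: OLLAMA_URL)
  response = Net::HTTP.start(uri.host, uri.port, read_timeout: 600) do |http|
    http.request(build_generate_request(prompt, uri: uri))
  end
  raise "Ollama returned HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  JSON.parse(response.body).fetch("response")
end

# Usage, with the server running and the model pulled:
#   puts generate("Write a Rails controller for managing user subscriptions with state transitions")
```

The generous read timeout is a guess for slow local hardware; adjust it to your machine.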

### Memory requirements

| Format | GGUF Size | Min RAM | Recommended |
|---|---|---|---|
| Q5_K_M | 21.7 GB | 24 GB | 32 GB |
| Q4_K_M | 18.6 GB | 20 GB | 24 GB |

Rule of thumb: GGUF file size + 2–4 GB for KV cache and overhead.
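
The rule of thumb can be written as a few lines of Ruby to check whether a quantization is likely to fit on a given machine. This is only a rough sketch of the arithmetic above; `estimated_ram_gb` is a hypothetical helper, and the 2-4 GB overhead is the same estimate as in the text, not a measured value.

```ruby
# Rule of thumb from the text: GGUF file size plus 2-4 GB for KV cache and overhead.
OVERHEAD_GB = [2.0, 4.0].freeze

# Returns [low, high] RAM estimates in GB for a given GGUF file size.
def estimated_ram_gb(gguf_size_gb, overhead: OVERHEAD_GB)
  overhead.map { |extra| (gguf_size_gb + extra).round(1) }
end

# File sizes taken from the table above.
{ "Q5_K_M" => 21.7, "Q4_K_M" => 18.6 }.each do |quant, size_gb|
  low, high = estimated_ram_gb(size_gb)
  puts "#{quant}: roughly #{low}-#{high} GB"
end
```

Longer contexts grow the KV cache, so treat the lower figure as a floor rather than a guarantee.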

## Training

Trained with QLoRA (rank 16, alpha 16) applied to the attention projections (`q_proj`, `k_proj`, `v_proj`, `o_proj`) and the feed-forward projections (`gate_proj`, `up_proj`, `down_proj`). Only 0.78% of parameters were trained.
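
To see why such a small share of parameters is trainable, it helps to count what a rank-16 adapter adds to a single linear layer. The sketch below uses the standard LoRA arithmetic: an adapter on a `d_in` x `d_out` layer adds `rank * (d_in + d_out)` parameters and scales its update by `alpha / rank`. The 2048 x 2048 layer is a hypothetical size chosen for illustration, not the model's actual dimensions, and the helper names are made up for this example.

```ruby
# Trainable parameters LoRA adds to one linear layer:
# matrix A is rank x d_in, matrix B is d_out x rank.
def lora_params(d_in, d_out, rank: 16)
  rank * (d_in + d_out)
end

# LoRA scales the low-rank update by alpha / rank.
def lora_scale(alpha: 16, rank: 16)
  alpha.to_f / rank
end

# Hypothetical 2048 x 2048 projection, not the model's real dimensions.
d_in = d_out = 2048
adapter = lora_params(d_in, d_out)
full = d_in * d_out

puts "adapter parameters: #{adapter}"
puts "full layer parameters: #{full}"
puts "update scale: #{lora_scale}"
```

With rank and alpha both set to 16, the scale is 1, so the adapter's update is applied unscaled.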

The dataset pipeline:
1. Extracted code from our internal Rails projects
2. Ran a 15-step cleaning and deduplication pipeline
3. Produced 111K final training samples
4. Included 29 contrastive pairs (wrong way vs. right way)
5. Capped each repository at 20% of the samples for source diversity
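
Two of these steps, deduplication and the per-repository cap, can be sketched in plain Ruby. This is an illustration under stated assumptions, not our actual pipeline: `Sample`, `deduplicate`, and `cap_per_repo` are hypothetical names, duplicates are detected only by exact match after whitespace normalization, and the cap is computed against the size of the input set rather than the final set.

```ruby
require "digest"

# Hypothetical record: which repository a snippet came from, and its source code.
Sample = Struct.new(:repo, :code)

# Drops exact duplicates after collapsing whitespace (one possible cleaning step).
def deduplicate(samples)
  samples.uniq { |s| Digest::SHA256.hexdigest(s.code.gsub(/\s+/, " ").strip) }
end

# Keeps at most `max_share` of the input size from any single repository.
# Simplification: the limit is relative to the input, not to the capped output.
def cap_per_repo(samples, max_share: 0.20)
  limit = [(samples.size * max_share).floor, 1].max
  samples.group_by(&:repo).flat_map { |_repo, group| group.first(limit) }
end

# Usage: capped = cap_per_repo(deduplicate(samples))
```

Taking the first samples per repository keeps the example short; a real pipeline would more likely sample at random or by quality score.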

Full details in our blog posts:
- [Part 1: Dataset Engineering](https://bytecode.hr/posts/training-rails-llms-part-1-dataset-engineering)
- [Part 2: Training, Quantization, and Deployment](https://bytecode.hr/posts/training-rails-llms-part-2-training-quantization-deployment)

## Why Ruby for LLMs?

In our analysis, Ruby uses 42–45% fewer tokens than TypeScript across every major LLM tokenizer. That means more code fits in the context window, generations are faster, and costs are lower. Read the full analysis: [Why Ruby Is the Better Language for LLM-Powered Development](https://bytecode.hr/posts/why-ruby-is-the-better-language-for-llm-powered-development).

## Other models

- [bytecodehr/qwen3-8b-rails](https://huggingface.co/bytecodehr/qwen3-8b-rails) — 8B dense model, runs on laptops (5 GB)
- [bytecodehr/qwen2.5-coder-7b-rails](https://huggingface.co/bytecodehr/qwen2.5-coder-7b-rails) — 7B LoRA adapter
- [bytecodehr/qwen2.5-coder-3b-rails](https://huggingface.co/bytecodehr/qwen2.5-coder-3b-rails) — 3B LoRA adapter