| --- |
| language: |
| - en |
| license: apache-2.0 |
| library_name: transformers |
| tags: |
| - ruby |
| - rails |
| - code-generation |
| - gguf |
| - fine-tuned |
| - lora |
| - unsloth |
| pipeline_tag: text-generation |
| base_model: Qwen/Qwen3-Coder-30B-A3B-Instruct |
| model-index: |
| - name: qwen3-coder-30b-rails |
| results: [] |
| --- |
| |
| # qwen3-coder-30b-rails |
|
|
| A 31B-parameter Mixture-of-Experts model (3B parameters active) fine-tuned for **Ruby on Rails code generation**. It was trained on 111,000 samples extracted from our own internal Rails projects. |
|
|
| Built by [Bytecode](https://bytecode.hr). |
|
|
| ## Model Details |
|
|
| | Property | Value | |
| |---|---| |
| | Base model | [Qwen3-Coder-30B-A3B-Instruct](https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct) | |
| | Architecture | Qwen3 MoE (31B total, 3B active) | |
| | Training method | QLoRA (rank 16) via [Unsloth](https://github.com/unslothai/unsloth) | |
| | Training data | 111K samples from internal Rails projects | |
| | Training cost | ~$32 (A100 80GB, ~26 hours) | |
| | Quantization | GGUF Q4_K_M (18.6 GB), Q5_K_M (21.7 GB) | |
|
|
| ## What it does |
|
|
| This model writes idiomatic Ruby on Rails code following specific conventions: |
|
|
| - Devise authentication |
| - Namespaced concerns instead of service objects |
| - Sidekiq instead of Solid Queue |
| - State-as-records instead of boolean flags |
| - DaisyUI drawer layouts instead of ActiveAdmin |
|
|
| It generates code that follows these patterns without prompt engineering: the conventions are baked into the weights. |
|
|
| ## Usage with Ollama |
|
|
| ```bash |
| # Download and run |
| ollama run bytecodehr/qwen3-coder-30b-rails |
| |
| # Example prompt |
| ollama run bytecodehr/qwen3-coder-30b-rails "Write a Rails controller for managing user subscriptions with state transitions" |
| ``` |
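If you would rather call the model from a script than from the CLI, the sketch below sends the same prompt to Ollama's local HTTP API. It assumes an Ollama server is running on the default port (11434) and that the model has already been pulled; `build_request` and `generate` are illustrative helper names, not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "bytecodehr/qwen3-coder-30b-rails"


def build_request(prompt: str, model: str = MODEL) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout: float = 600.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Write a Rails controller for managing user subscriptions with state transitions")` should return the completion as a single string. The timeout is deliberately generous because the first request also loads the model into memory.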
|
|
| ### Memory requirements |
|
|
| | Format | GGUF Size | Min RAM | Recommended | |
| |---|---|---|---| |
| | Q5_K_M | 21.7 GB | 24 GB | 32 GB | |
| | Q4_K_M | 18.6 GB | 20 GB | 24 GB | |
|
|
| Rule of thumb: GGUF file size + 2 to 4 GB for KV cache and overhead. |
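The rule of thumb can be written as a small helper. This is a rough estimate only: the 2 to 4 GB overhead comes from the rule above, real usage depends on context length and runtime settings, and `estimate_ram_gb` is a hypothetical name used for illustration.

```python
def estimate_ram_gb(gguf_size_gb, overhead_gb=(2.0, 4.0)):
    """Return a (low, high) RAM estimate: GGUF file size plus KV cache/overhead."""
    low, high = overhead_gb
    return (round(gguf_size_gb + low, 1), round(gguf_size_gb + high, 1))


# File sizes taken from the table above.
for name, size_gb in {"Q4_K_M": 18.6, "Q5_K_M": 21.7}.items():
    low, high = estimate_ram_gb(size_gb)
    print(f"{name}: {low} to {high} GB")
```

This prints `Q4_K_M: 20.6 to 22.6 GB` and `Q5_K_M: 23.7 to 25.7 GB`, so the recommended values in the table leave headroom above the estimate.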
|
|
| ## Training |
|
|
| Trained with LoRA adapters (rank 16, alpha 16) on the attention projections (`q_proj`, `k_proj`, `v_proj`, `o_proj`) and the MLP projections (`gate_proj`, `up_proj`, `down_proj`). Only 0.78% of parameters were trained. |
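To see why so few parameters end up trainable, recall that a LoRA adapter on a linear layer of shape `d_in x d_out` adds two low-rank matrices, for `rank * (d_in + d_out)` new weights. The sketch below works through that arithmetic. The 2048 x 2048 layer is a made-up example shape, not the model's actual dimensions, and the function names are illustrative.

```python
RANK = 16
ALPHA = 16
TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj",
                  "gate_proj", "up_proj", "down_proj"]

# Standard LoRA scaling factor applied to the adapter output.
SCALING = ALPHA / RANK  # 1.0 for rank 16, alpha 16


def lora_param_count(d_in, d_out, rank=RANK):
    """Weights added by one adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def trainable_fraction(layer_shapes, rank=RANK):
    """Adapter weights divided by the frozen weights of the adapted layers."""
    frozen = sum(d_in * d_out for d_in, d_out in layer_shapes)
    added = sum(lora_param_count(d_in, d_out, rank) for d_in, d_out in layer_shapes)
    return added / frozen


# Hypothetical single 2048 x 2048 projection, for illustration only.
print(lora_param_count(2048, 2048))
print(f"{trainable_fraction([(2048, 2048)]):.4%}")
```

For this toy layer the adapter adds 65,536 weights, about 1.56% of the layer. The figure quoted above is lower because it is measured against all parameters in the model, not only the adapted layers.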
|
|
| The dataset pipeline: |
| 1. Extract code from our internal Rails projects |
| 2. Run a 15-step cleaning and deduplication pipeline |
| 3. Keep 111K final training samples |
| 4. Include 29 contrastive pairs (wrong way vs. right way) |
| 5. Cap each repository at 20% of the samples for source diversity |
|
| Full details in our blog posts: |
| - [Part 1: Dataset Engineering](https://bytecode.hr/posts/training-rails-llms-part-1-dataset-engineering) |
| - [Part 2: Training, Quantization, and Deployment](https://bytecode.hr/posts/training-rails-llms-part-2-training-quantization-deployment) |
|
|
| ## Why Ruby for LLMs? |
|
|
| Ruby uses 42-45% fewer tokens than TypeScript across every major LLM tokenizer. That means more code fits in the context window, generations are faster, and costs are lower. Read our analysis: [Why Ruby Is the Better Language for LLM-Powered Development](https://bytecode.hr/posts/why-ruby-is-the-better-language-for-llm-powered-development). |
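To put the token savings in context-window terms: if equivalent code needs a fraction `s` fewer tokens, the same token budget holds `1 / (1 - s)` times as much code. The sketch below does that arithmetic for the range quoted above, assuming the savings apply uniformly across a codebase; `capacity_gain` is an illustrative name.

```python
def capacity_gain(token_savings):
    """How many times more code fits in a fixed token budget."""
    return 1.0 / (1.0 - token_savings)


for savings in (0.42, 0.45):
    print(f"{savings:.0%} fewer tokens -> {capacity_gain(savings):.2f}x as much code per context window")
```

This works out to roughly 1.72x to 1.82x as much code in the same context window.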
|
|
| ## Other models |
|
|
| - [bytecodehr/qwen3-8b-rails](https://huggingface.co/bytecodehr/qwen3-8b-rails): 8B dense model, runs on laptops (5 GB) |
| - [bytecodehr/qwen2.5-coder-7b-rails](https://huggingface.co/bytecodehr/qwen2.5-coder-7b-rails): 7B LoRA adapter |
| - [bytecodehr/qwen2.5-coder-3b-rails](https://huggingface.co/bytecodehr/qwen2.5-coder-3b-rails): 3B LoRA adapter |