language:
  - en
license: apache-2.0
library_name: transformers
tags:
  - ruby
  - rails
  - code-generation
  - gguf
  - fine-tuned
  - lora
  - unsloth
pipeline_tag: text-generation
base_model: Qwen/Qwen3-Coder-30B-A3B-Instruct
model-index:
  - name: qwen3-coder-30b-rails
    results: []

qwen3-coder-30b-rails

A 31B-parameter Mixture-of-Experts model fine-tuned for Ruby on Rails code generation. It was trained on 111,000 samples extracted from 45 Rails repositories: 35 private client projects and 10 open-source codebases.

Built by Bytecode.

Model Details

Property	Value
Base model	Qwen3-Coder-30B-A3B-Instruct
Architecture	Qwen3 MoE (31B total, 3B active)
Training method	QLoRA (rank 16) via Unsloth
Training data	111K samples from 45 Rails repos
Training cost	~$32 (A100 80GB, ~26 hours)
Quantization	GGUF Q4_K_M (18.6 GB), Q5_K_M (21.7 GB)

What it does

This model writes idiomatic Ruby on Rails code following specific conventions:

Custom authentication with Identity and MagicLink models (not Devise)
Namespaced concerns instead of service objects
Solid Queue instead of Sidekiq
State-as-records instead of boolean flags
DaisyUI drawer layouts instead of ActiveAdmin

It generates code that follows these patterns without prompt engineering — the conventions are baked into the weights.

Usage with Ollama

# Download and run
ollama run bytecodehr/qwen3-coder-30b-rails

# Example prompt
ollama run bytecodehr/qwen3-coder-30b-rails "Write a Rails controller for managing user subscriptions with state transitions"
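The same prompt can also be sent from a script. The following is a minimal sketch, assuming an Ollama server is running locally on its default port (11434) and that the model has already been pulled under the name used above. The helper names `build_request` and `generate` are illustrative and not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "bytecodehr/qwen3-coder-30b-rails"


def build_request(prompt, model=MODEL, url=OLLAMA_URL):
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt):
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Write a Rails controller for managing user subscriptions with state transitions")` should return the completion as a single string, provided the server is up and the model fits in memory.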

Memory requirements

Quantization	GGUF size	Minimum RAM	Recommended RAM
Q5_K_M	21.7 GB	24 GB	32 GB
Q4_K_M	18.6 GB	20 GB	24 GB

Rule of thumb: budget the GGUF file size plus 2–4 GB for the KV cache and runtime overhead.
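As a quick check, the rule of thumb can be written as a small helper. This is only a sketch: the function name `estimate_ram_gb` is hypothetical, and the 2–4 GB overhead range is taken directly from the rule above rather than measured.

```python
def estimate_ram_gb(gguf_size_gb, overhead_gb=(2.0, 4.0)):
    """Return (low, high) RAM estimates: file size plus KV-cache/runtime overhead."""
    low, high = overhead_gb
    return round(gguf_size_gb + low, 1), round(gguf_size_gb + high, 1)


# File sizes from the table above.
for name, size_gb in {"Q4_K_M": 18.6, "Q5_K_M": 21.7}.items():
    low, high = estimate_ram_gb(size_gb)
    print(f"{name}: {low}-{high} GB")
```

Longer context windows grow the KV cache, so the upper end of the range is the safer figure when working with large prompts.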

Training

Trained with LoRA adapters (rank 16, alpha 16) on the attention projections (q_proj, k_proj, v_proj, o_proj) and the MLP projections (gate_proj, up_proj, down_proj). Only 0.78% of parameters were trained.
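To see why the trainable fraction stays so small, recall that LoRA adds two low-rank matrices to each adapted linear layer, contributing `rank * (d_in + d_out)` trainable parameters while the original `d_in * d_out` weights stay frozen. The sketch below works through that arithmetic. The layer shape is a hypothetical placeholder, not the real dimensions of this model, so the printed percentage illustrates the mechanism and is not meant to reproduce the 0.78% figure.

```python
RANK, ALPHA = 16, 16  # values stated in this card
TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj",
                  "gate_proj", "up_proj", "down_proj"]


def lora_added_params(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one linear layer (A: rank x d_in, B: d_out x rank)."""
    return rank * (d_in + d_out)


def lora_scaling(alpha, rank):
    """Factor applied to the low-rank update in standard LoRA."""
    return alpha / rank


# Hypothetical square projection, chosen only for illustration.
D_IN = D_OUT = 2048
added = lora_added_params(D_IN, D_OUT, RANK)
frozen = D_IN * D_OUT
print(f"added={added}, frozen={frozen}, ratio={added / frozen:.4%}")
print(f"scaling={lora_scaling(ALPHA, RANK)}")
```

With alpha equal to rank, the scaling factor is 1.0, so the adapter update is applied unscaled. The overall trainable fraction of a full model also depends on which modules are adapted and on parameters outside the targeted layers.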

The dataset pipeline:

Extracted code from 45 Rails repos (35 private + 10 open-source)
15-step cleaning and deduplication pipeline
111K final training samples
Includes 29 contrastive pairs (wrong way vs right way)
Source diversity cap at 20% per repository
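Two of these steps, deduplication and the per-repository cap, can be sketched in a few lines. This is one possible interpretation and not the actual pipeline: it assumes samples are `(repo, text)` pairs, treats duplicates as texts that match after whitespace normalization, and reads the 20% cap as "no repository may exceed 20% of the kept samples". All function names are hypothetical.

```python
import hashlib
from collections import Counter, defaultdict


def dedupe(samples):
    """Drop samples whose whitespace-normalized text was already seen."""
    seen, kept = set(), []
    for repo, text in samples:
        digest = hashlib.sha256(" ".join(text.split()).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append((repo, text))
    return kept


def per_repo_limit(counts, cap_percent=20):
    """Largest per-repo count such that no repo exceeds cap_percent of the kept total."""
    limit = max(counts.values(), default=0)
    while True:
        total = sum(min(n, limit) for n in counts.values())
        if limit * 100 <= cap_percent * total:
            return limit
        # Shrinking one repo shrinks the total too, so re-derive the limit and retry.
        limit = total * cap_percent // 100


def cap_sources(samples, cap_percent=20):
    """Keep at most the computed limit of samples per repo, in input order."""
    limit = per_repo_limit(Counter(repo for repo, _ in samples), cap_percent)
    taken, kept = defaultdict(int), []
    for repo, text in samples:
        if taken[repo] < limit:
            taken[repo] += 1
            kept.append((repo, text))
    return kept
```

The limit is computed iteratively because trimming a dominant repository also reduces the total, which in turn lowers the allowed count. With fewer than five repositories a 20% cap cannot be met by any non-empty selection, so this sketch would return an empty list in that case.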

Full details are in our blog posts.

Why Ruby for LLMs?

Ruby uses 42–45% fewer tokens than TypeScript across every major LLM tokenizer. That means more code fits in the context window, generations are faster, and costs are lower. Read our analysis: Why Ruby Is the Better Language for LLM-Powered Development.

Other models

bytecodehr/qwen3-8b-rails — 8B dense model, runs on laptops (5 GB)
bytecodehr/qwen2.5-coder-7b-rails — 7B LoRA adapter
bytecodehr/qwen2.5-coder-3b-rails — 3B LoRA adapter