	---
	language:
	- en
	license: apache-2.0
	library_name: transformers
	tags:
	- ruby
	- rails
	- code-generation
	- gguf
	- fine-tuned
	- lora
	- unsloth
	pipeline_tag: text-generation
	base_model: Qwen/Qwen3-Coder-30B-A3B-Instruct
	model-index:
	- name: qwen3-coder-30b-rails
	  results: []
	---

	# qwen3-coder-30b-rails

	A 31B-parameter Mixture-of-Experts model (about 3B parameters active per token) fine-tuned for Ruby on Rails code generation. Trained on 111,000 samples extracted from our internal Rails projects.

	Built by [Bytecode](https://bytecode.hr).

	## Model Details

	| Property | Value |
	|---|---|
	| Base model | [Qwen3-Coder-30B-A3B-Instruct](https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct) |
	| Architecture | Qwen3 MoE (31B total, 3B active) |
	| Training method | QLoRA (rank 16) via [Unsloth](https://github.com/unslothai/unsloth) |
	| Training data | 111K samples from internal Rails projects |
	| Training cost | ~$32 (A100 80GB, ~26 hours) |
	| Quantization | GGUF Q4_K_M (18.6 GB), Q5_K_M (21.7 GB) |

	## What it does

	This model writes idiomatic Ruby on Rails code following specific conventions:

	- Devise authentication
	- Namespaced concerns instead of service objects
	- Sidekiq instead of Solid Queue
	- State-as-records instead of boolean flags
	- DaisyUI drawer layouts instead of ActiveAdmin

	It generates code that follows these patterns without prompt engineering: the conventions were learned during fine-tuning, so they do not need to be restated in the prompt.

	## Usage with Ollama

	```bash
	# Download and run
	ollama run bytecodehr/qwen3-coder-30b-rails

	# Example prompt
	ollama run bytecodehr/qwen3-coder-30b-rails "Write a Rails controller for managing user subscriptions with state transitions"
	```
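The same model can be called from Ruby through Ollama's local HTTP API. The sketch below assumes Ollama is running on its default port (11434) and that the model has already been pulled under the name shown above. `build_generate_request` and `generate` are hypothetical helper names, and the timeout value is an arbitrary choice.

```ruby
require "json"
require "net/http"
require "uri"

# Assumptions: default local Ollama endpoint, model name as used in this card.
OLLAMA_URL = URI("http://localhost:11434/api/generate")
MODEL_NAME = "bytecodehr/qwen3-coder-30b-rails"

# Build the POST request for a single, non-streaming completion.
def build_generate_request(prompt, model: MODEL_NAME)
  request = Net::HTTP::Post.new(OLLAMA_URL, "Content-Type" => "application/json")
  request.body = JSON.generate(model: model, prompt: prompt, stream: false)
  request
end

# Send the request and return the generated text.
def generate(prompt)
  response = Net::HTTP.start(OLLAMA_URL.host, OLLAMA_URL.port) do |http|
    http.read_timeout = 600 # generation with a large model can be slow
    http.request(build_generate_request(prompt))
  end
  raise "Ollama returned HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  JSON.parse(response.body).fetch("response")
end

# Example call (requires a running Ollama server):
# puts generate("Write a Sidekiq job that sends a welcome email")
```

Setting `stream: false` returns one JSON object instead of a stream of chunks, which keeps the client code short at the cost of waiting for the full completion.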

	### Memory requirements

	| Format | GGUF Size | Min RAM | Recommended |
	|---|---|---|---|
	| Q5_K_M | 21.7 GB | 24 GB | 32 GB |
	| Q4_K_M | 18.6 GB | 20 GB | 24 GB |

	Rule of thumb: GGUF file size + 2–4 GB for KV cache and overhead.
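As a quick check, the rule of thumb can be applied to the two file sizes listed above. This is only the arithmetic from the previous sentence; `estimated_ram_gb` is a hypothetical helper, and real usage depends on context length and runtime settings.

```ruby
# Rule-of-thumb overhead for KV cache and runtime, in GB (from the text above).
OVERHEAD_GB = (2.0..4.0)

# Return the estimated RAM range in GB for a GGUF file of the given size.
def estimated_ram_gb(gguf_size_gb)
  low  = (gguf_size_gb + OVERHEAD_GB.begin).round(1)
  high = (gguf_size_gb + OVERHEAD_GB.end).round(1)
  low..high
end

{ "Q5_K_M" => 21.7, "Q4_K_M" => 18.6 }.each do |format, size_gb|
  range = estimated_ram_gb(size_gb)
  puts "#{format}: #{range.begin}-#{range.end} GB"
end
```

The minimum figures in the table sit close to the low end of each computed range, so treat the rule as a rough guide rather than a guarantee.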

	## Training

	Trained with QLoRA (rank 16, alpha 16), with adapters on the attention projections (`q_proj`, `k_proj`, `v_proj`, `o_proj`) and the MLP projections (`gate_proj`, `up_proj`, `down_proj`). Only 0.78% of parameters were trained.
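For readers unfamiliar with these hyperparameters, the standard LoRA formulation is shown below. It is the general definition, not something specific to this training run, and it assumes the default scaling of alpha divided by rank rather than a rank-stabilized variant. Each adapted weight matrix stays frozen and receives a low-rank update:

$$
W' = W + \frac{\alpha}{r} B A, \qquad B \in \mathbb{R}^{d_{\text{out}} \times r},\quad A \in \mathbb{R}^{r \times d_{\text{in}}}
$$

With rank and alpha both set to 16, the scaling factor is 1, and each adapted matrix adds the following number of trainable parameters:

$$
\frac{\alpha}{r} = \frac{16}{16} = 1, \qquad r\,(d_{\text{in}} + d_{\text{out}}) = 16\,(d_{\text{in}} + d_{\text{out}})
$$

Because this count grows linearly in the layer dimensions while the frozen matrix grows with their product, the trainable share stays small.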

	The dataset pipeline:
	1. Extracted code from our internal Rails projects
	2. Ran it through a 15-step cleaning and deduplication pipeline
	3. Produced 111K final training samples

	The final dataset includes 29 contrastive pairs (wrong way vs. right way) and caps any single repository at 20% of the samples to keep sources diverse.
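Two of the ideas above, deduplication and the per-repository cap, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the actual 15-step pipeline: samples are assumed to be hashes with `:repo` and `:code` keys, duplicates are detected by hashing whitespace-normalized code, and the cap is interpreted as "no repository exceeds 20% of the final dataset". `dedupe` and `cap_per_repo` are hypothetical names.

```ruby
require "digest"

# Drop exact duplicates, ignoring whitespace differences (assumed criterion).
def dedupe(samples)
  samples.uniq { |s| Digest::SHA256.hexdigest(s[:code].gsub(/\s+/, " ").strip) }
end

# Limit every repository to at most `max_share` of the resulting dataset.
def cap_per_repo(samples, max_share: 0.2)
  groups = samples.group_by { |s| s[:repo] }
  cap = groups.values.map(&:size).max || 0

  # Trimming a large repo shrinks the total, which lowers the allowed count
  # again, so tighten the cap until it is consistent with the final size.
  loop do
    total = groups.values.sum { |g| [g.size, cap].min }
    allowed = (total * max_share).floor
    break if allowed >= cap

    cap = allowed
  end

  groups.values.flat_map { |g| g.first(cap) }
end
```

Note that a 20% cap can only be met with at least five repositories; with fewer, this sketch shrinks the cap to zero and returns an empty dataset.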

	Full details in our blog posts:
	- [Part 1: Dataset Engineering](https://bytecode.hr/posts/training-rails-llms-part-1-dataset-engineering)
	- [Part 2: Training, Quantization, and Deployment](https://bytecode.hr/posts/training-rails-llms-part-2-training-quantization-deployment)

	## Why Ruby for LLMs?

	In our analysis, Ruby used 42–45% fewer tokens than TypeScript across every major LLM tokenizer. Fewer tokens mean more code fits in the context window, generation is faster, and cost is lower. Details: [Why Ruby Is the Better Language for LLM-Powered Development](https://bytecode.hr/posts/why-ruby-is-the-better-language-for-llm-powered-development).

	## Other models

	- [bytecodehr/qwen3-8b-rails](https://huggingface.co/bytecodehr/qwen3-8b-rails) — 8B dense model, runs on laptops (5 GB)
	- [bytecodehr/qwen2.5-coder-7b-rails](https://huggingface.co/bytecodehr/qwen2.5-coder-7b-rails) — 7B LoRA adapter
	- [bytecodehr/qwen2.5-coder-3b-rails](https://huggingface.co/bytecodehr/qwen2.5-coder-3b-rails) — 3B LoRA adapter