---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- ruby
- rails
- code-generation
- gguf
- fine-tuned
- lora
- unsloth
pipeline_tag: text-generation
base_model: Qwen/Qwen3-8B
model-index:
- name: qwen3-8b-rails
  results: []
---

# qwen3-8b-rails

An 8B-parameter dense model fine-tuned for **Ruby on Rails code generation**. Trained on 111,000 samples extracted from our own internal Rails projects. Small enough to run on a laptop.

Built by [Bytecode](https://bytecode.hr).

## Model Details

| Property | Value |
|---|---|
| Base model | [Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) |
| Architecture | Qwen3 dense (8B parameters) |
| Training method | QLoRA (rank 16) via [Unsloth](https://github.com/unslothai/unsloth) |
| Training data | 111K samples from internal Rails projects |
| Training cost | ~$21 (A100 80GB, ~17 hours) |
| Quantization | GGUF Q4_K_M (5.03 GB) |

## What it does

This model writes idiomatic Ruby on Rails code following specific conventions:

- Devise authentication
- Namespaced concerns instead of service objects
- Sidekiq instead of Solid Queue
- State-as-records instead of boolean flags
- DaisyUI drawer layouts instead of ActiveAdmin

The 8B model is the lightweight option: fast enough for inline code completion and small enough to run alongside your development server without swapping.

## Usage with Ollama

```bash
# Download and run
ollama run bytecodehr/qwen3-8b-rails

# Example prompt
ollama run bytecodehr/qwen3-8b-rails "Write a Rails migration for a subscriptions table with plan, status, and billing cycle"
```

### Memory requirements

| Format | GGUF Size | Min RAM | Recommended |
|---|---|---|---|
| Q4_K_M | 5.03 GB | 8 GB | 16 GB |

Fits comfortably on any modern laptop. Budget for the GGUF file size plus 2–3 GB for the KV cache.

## Training

Trained with LoRA (rank 16, alpha 16) on attention projection layers. Only 0.78% of parameters were trained. The full training run took ~17 hours on a single A100 80GB GPU.

The dataset:

1. Our internal Rails projects
2. 15-step cleaning and deduplication pipeline
3. 111K final training samples with contrastive pairs
4. Source diversity cap at 20% per repository

Full details in our blog posts:

- [Part 1: Dataset Engineering](https://bytecode.hr/posts/training-rails-llms-part-1-dataset-engineering)
- [Part 2: Training, Quantization, and Deployment](https://bytecode.hr/posts/training-rails-llms-part-2-training-quantization-deployment)

## Why Ruby for LLMs?

Ruby uses 42–45% fewer tokens than TypeScript across every major LLM tokenizer. Fewer tokens means more code in the context window, faster generations, and lower costs.

Read our analysis: [Why Ruby Is the Better Language for LLM-Powered Development](https://bytecode.hr/posts/why-ruby-is-the-better-language-for-llm-powered-development).

## Other models

- [bytecodehr/qwen3-coder-30b-rails](https://huggingface.co/bytecodehr/qwen3-coder-30b-rails): 31B MoE flagship model (18–21 GB)
- [bytecodehr/qwen2.5-coder-7b-rails](https://huggingface.co/bytecodehr/qwen2.5-coder-7b-rails): 7B LoRA adapter
- [bytecodehr/qwen2.5-coder-3b-rails](https://huggingface.co/bytecodehr/qwen2.5-coder-3b-rails): 3B LoRA adapter
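
The CLI commands in "Usage with Ollama" can also be mirrored from Ruby code. The following is a minimal sketch, not an official client. It assumes Ollama is already running locally on its default port (11434) with the model pulled, and it uses Ollama's `/api/generate` endpoint with streaming disabled. The helper names `build_generate_request` and `generate` are hypothetical and chosen only for illustration.

```ruby
require "json"
require "net/http"
require "uri"

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_NAME = "bytecodehr/qwen3-8b-rails"

# Build (but do not send) a non-streaming generate request.
# Returns the parsed URI and the prepared Net::HTTP::Post object.
def build_generate_request(prompt, model: MODEL_NAME, url: OLLAMA_URL)
  uri = URI(url)
  request = Net::HTTP::Post.new(uri.path, "Content-Type" => "application/json")
  request.body = JSON.generate(model: model, prompt: prompt, stream: false)
  [uri, request]
end

# Send the request and return the generated text.
# A long read timeout is used because local generation can be slow.
def generate(prompt, **options)
  uri, request = build_generate_request(prompt, **options)
  response = Net::HTTP.start(uri.host, uri.port, read_timeout: 300) do |http|
    http.request(request)
  end
  raise "Ollama returned HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  # With stream: false, the reply is a single JSON object whose
  # "response" field holds the full completion.
  JSON.parse(response.body).fetch("response")
end

# Example call (requires a running Ollama server):
# puts generate("Write a Rails migration for a subscriptions table with plan, status, and billing cycle")
```

Keeping request construction separate from sending makes the payload easy to inspect or test without a running server. Anything beyond this single call, such as streaming or retries, is left out of the sketch.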