---
library_name: transformers
license: apache-2.0
license_link: LICENSE
pipeline_tag: image-text-to-text
base_model:
- Qwen/Qwen3.5-4B
tags:
- verus
- coding
- multimodal
- vision
- 262k-context
language:
- en
---

# Verus-4B

[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)
[![Model Size](https://img.shields.io/badge/Parameters-4B-brightgreen)]()
[![Context](https://img.shields.io/badge/Context-262K%20tokens-orange)]()
[![HF Transformers](https://img.shields.io/badge/Transformers-%E2%89%A54.52-red)](https://github.com/huggingface/transformers)

> [!Note]
> This repository contains model weights and configuration files for **Verus-4B** in the Hugging Face Transformers format.
>
> Compatible with Hugging Face Transformers, vLLM, SGLang, and other major inference frameworks.
>
> Primary intended use cases are **code generation**, **code review**, **debugging**, and **general coding assistance**.

## Verus-4B Highlights

- **Coding-First**: Fine-tuned specifically on high-quality coding datasets; handles everything from simple scripts to complex multi-file implementations.
- **Image + Text Input**: Accepts both images and text, so you can supply UI screenshots or diagrams alongside code questions.
- **262K Token Context Window**: Process entire codebases, long specifications, or lengthy conversations in a single pass.
- **Strong Instruction Following**: Stays focused, responds clearly, and steers the conversation back to the task at hand.
- **Efficient**: At 4B parameters in bfloat16, runs comfortably on a single consumer GPU with 8GB+ VRAM.
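
The memory figure in the last highlight can be sanity-checked with simple arithmetic: weight size is roughly parameter count times bytes per parameter. The sketch below is illustrative only. It assumes exactly 4e9 parameters (the card only says ~4B) and counts weights alone, so activations, KV cache, and framework overhead are not included. `BYTES_PER_PARAM` and `weight_memory_gib` are hypothetical helper names, not part of any library.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Assumption: exactly 4e9 parameters (the card says "~4B").
# Activations, KV cache, and framework overhead are NOT included.

BYTES_PER_PARAM = {
    "float32": 4.0,
    "bfloat16": 2.0,
    "int8": 1.0,
    "nf4": 0.5,  # 4 bits per weight, ignoring quantization constants
}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the approximate size of the model weights in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype:>8}: {weight_memory_gib(4e9, dtype):.2f} GiB")
```

Under these assumptions the bfloat16 weights alone come to roughly 7.5 GiB, so long prompts need additional headroom for the KV cache. The 4-bit option in the Quantized Inference section reduces the weight footprint to under 2 GiB before overhead.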

## Model Overview

| Property | Value |
|---|---|
| Parameters | ~4B |
| Context Length | **262,144 tokens** |
| Architecture | Qwen3.5 |
| Chat Format | ChatML (`<\|im_start\|>` / `<\|im_end\|>`) |
| Dtype | bfloat16 |
| License | Apache 2.0 |

## Quickstart

### Installation

```bash
pip install "transformers>=4.52.0" accelerate torch
```

### Code Generation

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

MODEL_ID = "8F-ai/Verus-4B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()

messages = [
    {
        "role": "system",
        "content": "You are Verus, a coding assistant made by 8F-ai. You help with coding tasks and keep responses focused and clean.",
    },
    {
        "role": "user",
        "content": "Write a Python async context manager that manages a PostgreSQL connection pool using asyncpg.",
    },
]

# Render the conversation with the model's chat template, then tokenize it.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.inference_mode():
    generated_ids = model.generate(
        **inputs,
        max_new_tokens=2048,
        # Set explicitly so temperature/top_p apply even if the
        # generation config defaults to greedy decoding.
        do_sample=True,
        temperature=0.1,
        top_p=0.95,
    )

# Decode only the newly generated tokens, skipping the prompt.
prompt_length = inputs.input_ids.shape[1]
output = tokenizer.decode(generated_ids[0][prompt_length:], skip_special_tokens=True)
print(output)
```

### Image + Text Input

```python
from transformers import AutoProcessor, AutoModelForImageTextToText
import torch

MODEL_ID = "8F-ai/Verus-4B"

# A tokenizer alone cannot load pixel data, so image inputs go through a
# processor. Assumption: the repository ships a processor config, as the
# image-text-to-text pipeline tag suggests.
processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForImageTextToText.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "path/to/screenshot.png"},
            {"type": "text", "text": "Convert this UI screenshot into a React component using Tailwind CSS."},
        ],
    }
]

# Apply the chat template, load the image, and tokenize in one step.
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

with torch.inference_mode():
    generated_ids = model.generate(
        **inputs,
        max_new_tokens=2048,
        do_sample=True,  # required for temperature/top_p to take effect
        temperature=0.1,
        top_p=0.95,
    )

# Decode only the newly generated tokens, skipping the prompt.
prompt_length = inputs["input_ids"].shape[1]
output = processor.decode(generated_ids[0][prompt_length:], skip_special_tokens=True)
print(output)
```

### Quantized Inference (4-bit NF4, ~4 GB VRAM)

```python
# Requires the bitsandbytes package: pip install bitsandbytes
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16, store weights in 4-bit
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_quant_type="nf4",
)

tokenizer = AutoTokenizer.from_pretrained("8F-ai/Verus-4B")
model = AutoModelForCausalLM.from_pretrained(
    "8F-ai/Verus-4B",
    quantization_config=quantization_config,
    device_map="auto",
)
```

## Intended Use Cases

| Use Case | Example |
|---|---|
| **Code Generation** | Write functions, classes, and scripts in any language |
| **Debugging** | Identify and fix bugs from error messages or code |
| **Code Review** | Suggest improvements, catch issues, explain code |
| **UI to Code** | Convert screenshots or diagrams into working code |
| **Long-Context Codebase** | Reason over entire repos up to ~200K tokens |
| **General Q&A** | Answer programming questions clearly and concisely |

## Limitations

- **English-Primary**: Fine-tuning was conducted predominantly on English-language code and documentation.
- **Not for Math/Science**: Not optimized for mathematical proofs or scientific computation.

## Citation

```bibtex
@misc{verus4b2026,
  title        = {Verus-4B: A Coding-Focused Multimodal Language Model with 262K Context},
  author       = {8F-ai},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/8F-ai/Verus-4B}},
  note         = {Apache 2.0 License}
}
```

## License

Verus-4B is released under the **Apache License 2.0**. See [LICENSE](LICENSE) for full terms.

Derived from [Qwen/Qwen3.5-4B](https://huggingface.co/Qwen/Qwen3.5-4B) (Apache 2.0).

---

Built with ❤️ by the 8F-ai Team