---
license: apache-2.0
base_model: empero-ai/Qwythos-9B-Claude-Mythos-5-1M
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- qwen3.5
- reasoning
- long-context
- 1M-context
- function-calling
- tool-use
- sft
- full-fine-tune
- agentic
- conversational
- multimodal
- vision
model-index:
- name: Hopcoder-Mini-9B
  results:
  - task:
      type: text-generation
      label: Text Generation
    dataset:
      name: Unknown
      type: generic
    metrics:
    - type: custom
      value: TBD
---

# Hopcoder-Mini-9B

**Hopcoder-Mini-9B** is a compact 9B-parameter reasoning model with a **1,048,576-token context window** (YaRN rope-scaling enabled by default), native function calling, and strong chain-of-thought performance.

## Highlights

- **1M-token context** out of the box via YaRN.
- **Native Qwen3.5-style function calling**: no wrapper needed.
- **Self-corrects with tools**: emits source-cited, factually grounded answers when given a Python executor and web search.
- Built on a Qwen3.5-9B base (via empero-ai/Qwythos-9B-Claude-Mythos-5-1M), full-parameter fine-tuned on high-quality reasoning traces.

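
To get a rough feel for what a 1M-token window can cost in memory, the sketch below estimates the size of the key/value cache from the figures in the Architecture table (4 KV heads, hidden size 4096, 16 attention heads, bfloat16). It is a back-of-the-envelope calculation, not a measurement. Several inputs are assumptions rather than facts from this card: the head dimension is taken as hidden size / attention heads = 256, the linear-attention layers are assumed to keep a fixed-size state that is not counted, and the number of full-attention layers is not stated here, so it is left as a parameter. The value `8` is purely hypothetical.

```python
def kv_cache_bytes(num_tokens, full_attn_layers, kv_heads=4, head_dim=256, bytes_per_value=2):
    """Estimate KV-cache size in bytes for the full-attention layers only.

    head_dim=256 is an assumption (hidden size 4096 / 16 attention heads);
    bytes_per_value=2 corresponds to bfloat16.
    """
    # Each cached token stores one key and one value vector per KV head, per layer.
    per_token_per_layer = 2 * kv_heads * head_dim * bytes_per_value
    return num_tokens * full_attn_layers * per_token_per_layer


GIB = 1024 ** 3
CONTEXT = 1_048_576  # maximum context length from the model card

# Upper bound: pretend all 32 layers use full attention.
upper_bound = kv_cache_bytes(CONTEXT, full_attn_layers=32)

# Hypothetical hybrid split: 8 full-attention layers (assumed, not from the card).
hypothetical = kv_cache_bytes(CONTEXT, full_attn_layers=8)

print(f"Upper bound (32 full-attention layers): {upper_bound / GIB:.0f} GiB")
print(f"Hypothetical (8 full-attention layers): {hypothetical / GIB:.0f} GiB")
```

Under these assumptions the estimate scales linearly with both context length and the number of full-attention layers, so checking the real layer split in the model's `config.json` matters before sizing hardware for long prompts.
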
## Architecture

| Field | Value |
|---|---|
| Architecture | Qwen3_5ForConditionalGeneration |
| Model type | qwen3_5 (text + vision) |
| Parameters | ~9B |
| Hidden size | 4096 |
| Layers | 32 (hybrid linear / full attention) |
| Attention heads | 16 |
| KV heads | 4 |
| Vocab size | 248,320 |
| Max context | 1,048,576 tokens |
| Precision | bfloat16 |

## Requirements

- `transformers >= 5.12.1` (required for the `qwen3_5` model type)
- `torch >= 2.1`
- `trust_remote_code=True` when loading

## Usage

### Text-only input

```python
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor

# Load the model in bfloat16 and let Accelerate place it on the available devices.
model = AutoModelForImageTextToText.from_pretrained(
    "TaimoorSiddiqui/Hopcoder-Mini-9B",
    dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
processor = AutoProcessor.from_pretrained(
    "TaimoorSiddiqui/Hopcoder-Mini-9B",
    trust_remote_code=True,
)

messages = [
    {"role": "user", "content": "What is 2+2?"},
]

# Render the chat template to a prompt string, then tokenize it.
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=text, return_tensors="pt").to(model.device)

out = model.generate(**inputs, max_new_tokens=512)
# The decoded string contains the prompt followed by the model's reply.
print(processor.decode(out[0], skip_special_tokens=True))
```

### Vision input

```python
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor
from PIL import Image

model = AutoModelForImageTextToText.from_pretrained(
    "TaimoorSiddiqui/Hopcoder-Mini-9B",
    dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
processor = AutoProcessor.from_pretrained(
    "TaimoorSiddiqui/Hopcoder-Mini-9B",
    trust_remote_code=True,
)

image = Image.open("example.jpg")
messages = [
    {"role": "user", "content": [
        {"type": "image", "image": image},
        {"type": "text", "text": "Describe this image."},
    ]},
]

# The image is passed both in the message (for the template) and to the processor (for pixel inputs).
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=text, images=image, return_tensors="pt").to(model.device)

out = model.generate(**inputs, max_new_tokens=512)
print(processor.decode(out[0], skip_special_tokens=True))
```

Recommended sampling settings: `temperature=0.6, top_p=0.95, top_k=20` (the Qwen3.5 defaults).

## License

Apache 2.0.