---
license: apache-2.0
base_model: empero-ai/Qwythos-9B-Claude-Mythos-5-1M
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- qwen3.5
- reasoning
- long-context
- 1M-context
- function-calling
- tool-use
- sft
- full-fine-tune
- agentic
- conversational
- multimodal
- vision
model-index:
- name: Hopcoder-Mini-9B
  results:
  - task:
      type: text-generation
      label: Text Generation
    dataset:
      name: Unknown
      type: generic
    metrics:
    - type: custom
      value: TBD
---
# Hopcoder-Mini-9B
**Hopcoder-Mini-9B** is a compact 9B-parameter reasoning model with a **1,048,576-token context window** (YaRN rope-scaling enabled by default), native function calling, and strong chain-of-thought performance.
## Highlights
- **1M-token context** out of the box via YaRN.
- **Native Qwen3.5-style function calling** — no wrapper needed.
- **Self-corrects with tools** — emits source-cited, factually grounded answers when given a Python executor and web search.
- Built on a Qwen3.5-9B base (via empero-ai/Qwythos-9B-Claude-Mythos-5-1M), full-parameter fine-tuned on high-quality reasoning traces.
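As a rough sketch of what a function-calling request might look like, the snippet below defines one tool in the JSON-schema "function" style that Transformers chat templates accept through the `tools` argument. The `get_weather` tool and its fields are hypothetical, and whether this model's bundled template expects exactly this shape is an assumption worth checking before relying on it.

```python
import json

# Hypothetical tool definition in the JSON-schema "function" style.
# The name, description, and parameters are illustrative only.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

messages = [{"role": "user", "content": "What is the weather in Lahore?"}]

# Assumption: the processor forwards `tools` to the chat template, as
# Transformers tokenizers do. Uncomment once `processor` is loaded.
# text = processor.apply_chat_template(
#     messages,
#     tools=[get_weather_tool],
#     tokenize=False,
#     add_generation_prompt=True,
# )

# Inspect the schema that would be handed to the template.
print(json.dumps(get_weather_tool, indent=2))
```

Executing the tool and feeding its result back as a follow-up message is left to the calling application; the exact tool-call output format depends on the chat template shipped with the model.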
## Architecture
| Field | Value |
|---|---|
| Architecture | Qwen3_5ForConditionalGeneration |
| Model type | qwen3_5 (text + vision) |
| Parameters | ~9B |
| Hidden size | 4096 |
| Layers | 32 (hybrid linear / full attention) |
| Attention heads | 16 |
| KV heads | 4 |
| Vocab size | 248,320 |
| Max context | 1,048,576 tokens |
| Precision | bfloat16 |
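
To get a feel for what the 1M-token context implies for memory, here is a back-of-the-envelope KV-cache estimate. The KV head count, precision, and context length come from the table above. The number of full-attention layers and the per-head dimension are not stated in this card, so the values used below are placeholders; read the real ones from the model's `config.json`.

```python
def kv_cache_bytes(seq_len, full_attn_layers, kv_heads, head_dim, bytes_per_value=2):
    """Approximate KV-cache size, counting only the full-attention layers."""
    # The leading factor 2 accounts for storing both keys and values.
    return 2 * full_attn_layers * kv_heads * head_dim * seq_len * bytes_per_value


# From the table: 4 KV heads, bfloat16 (2 bytes), 1,048,576-token context.
# ASSUMPTIONS (not stated in this card): 8 of the 32 layers use full
# attention, and each head has dimension 256.
size = kv_cache_bytes(
    seq_len=1_048_576,
    full_attn_layers=8,  # hypothetical value
    kv_heads=4,
    head_dim=256,  # hypothetical value
)
print(f"{size / 2**30:.1f} GiB")  # 32.0 GiB under these assumptions
```

The estimate ignores the weights, activations, and whatever state the linear-attention layers keep, so treat it as a lower bound on memory for a single full-length sequence.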
## Requirements
- `transformers >= 5.12.1` (required for `qwen3_5` model type)
- `torch >= 2.1`
- `trust_remote_code=True` when loading
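
A quick way to check the two version requirements before loading the model is sketched below. The helper names `version_tuple` and `meets_minimum` are made up for this example, and the comparison looks only at the numeric major, minor, and patch fields, so a pre-release build such as `5.12.1.dev0` would be treated as satisfying `5.12.1`.

```python
import re
from importlib import metadata


def version_tuple(version):
    """Extract (major, minor, patch) from strings such as '2.1.0+cu121'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def meets_minimum(installed, required):
    """Return True if the installed version is at least the required one."""
    return version_tuple(installed) >= version_tuple(required)


# Minimum versions listed in the requirements above.
MINIMUMS = {"transformers": "5.12.1", "torch": "2.1"}

for package, required in MINIMUMS.items():
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        print(f"{package}: not installed (need >= {required})")
        continue
    ok = meets_minimum(installed, required)
    status = "ok" if ok else f"too old, need >= {required}"
    print(f"{package} {installed}: {status}")
```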
## Usage
### Text-only input
```python
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor
model = AutoModelForImageTextToText.from_pretrained(
"TaimoorSiddiqui/Hopcoder-Mini-9B",
dtype=torch.bfloat16,
device_map="auto",
trust_remote_code=True,
)
processor = AutoProcessor.from_pretrained(
"TaimoorSiddiqui/Hopcoder-Mini-9B",
trust_remote_code=True,
)
messages = [
{"role": "user", "content": "What is 2+2?"},
]
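# Render the chat template to a prompt string, then tokenize it.
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=text, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=512)
# out[0] holds the prompt tokens followed by the newly generated tokens.
print(processor.decode(out[0], skip_special_tokens=True))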
```
### Vision input
```python
import torch
from PIL import Image
from transformers import AutoModelForImageTextToText, AutoProcessor
model = AutoModelForImageTextToText.from_pretrained(
"TaimoorSiddiqui/Hopcoder-Mini-9B",
dtype=torch.bfloat16,
device_map="auto",
trust_remote_code=True,
)
processor = AutoProcessor.from_pretrained(
"TaimoorSiddiqui/Hopcoder-Mini-9B",
trust_remote_code=True,
)
image = Image.open("example.jpg")
messages = [
{"role": "user", "content": [
{"type": "image", "image": image},
{"type": "text", "text": "Describe this image."},
]},
]
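# Render the prompt (including the image placeholder), then tokenize text and image together.
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=text, images=image, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=512)
# out[0] holds the prompt tokens followed by the newly generated tokens.
print(processor.decode(out[0], skip_special_tokens=True))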
```
Sampling settings (Qwen3.5 defaults): `temperature=0.6`, `top_p=0.95`, `top_k=20`.
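
One way to apply these sampling values is to collect them in a dictionary and unpack it into `generate`. This is a sketch rather than a required setup: `do_sample=True` is added here because `generate` decodes greedily unless sampling is enabled, and it may be redundant if the repository's generation config already turns sampling on.

```python
# Sampling values quoted above. do_sample=True is an addition for this
# sketch so that temperature, top_p, and top_k actually take effect.
sampling_kwargs = {
    "do_sample": True,
    "temperature": 0.6,
    "top_p": 0.95,
    "top_k": 20,
}

# With `model` and `inputs` prepared as in the usage examples:
# out = model.generate(**inputs, max_new_tokens=512, **sampling_kwargs)

print(sampling_kwargs)
```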
## License
Apache 2.0.