metadata
license: apache-2.0
base_model: empero-ai/Qwythos-9B-Claude-Mythos-5-1M
language:
  - en
library_name: transformers
pipeline_tag: text-generation
tags:
  - qwen3.5
  - reasoning
  - long-context
  - 1M-context
  - function-calling
  - tool-use
  - sft
  - full-fine-tune
  - agentic
  - conversational
  - multimodal
  - vision
model-index:
  - name: Hopcoder-Mini-9B
    results:
      - task:
          type: text-generation
          label: Text Generation
        dataset:
          name: Unknown
          type: generic
        metrics:
          - type: custom
            value: TBD

Hopcoder-Mini-9B

Hopcoder-Mini-9B is a compact 9B-parameter reasoning model with a 1,048,576-token context window (YaRN RoPE scaling enabled by default), native function calling, and chain-of-thought reasoning. It accepts both text and image input.

Highlights

  • 1M-token context out of the box via YaRN.
  • Native Qwen3.5-style function calling — no wrapper needed.
  • Self-corrects with tools — emits source-cited, factually grounded answers when given a Python executor and web search.
  • Built on a Qwen3.5-9B base (via empero-ai/Qwythos-9B-Claude-Mythos-5-1M), full-parameter fine-tuned on high-quality reasoning traces.

Architecture

| Field | Value |
|---|---|
| Architecture | Qwen3_5ForConditionalGeneration |
| Model type | qwen3_5 (text + vision) |
| Parameters | ~9B |
| Hidden size | 4096 |
| Layers | 32 (hybrid linear / full attention) |
| Attention heads | 16 |
| KV heads | 4 |
| Vocab size | 248,320 |
| Max context | 1,048,576 tokens |
| Precision | bfloat16 |
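
As a rough sizing aid, the figures in the table can be turned into a back-of-the-envelope estimate of weight memory. This is only a sketch: it assumes exactly 9e9 parameters (the table says ~9B) and 2 bytes per parameter for bfloat16, and it ignores activations, the KV cache, and framework overhead, all of which grow with context length.

```python
# Back-of-the-envelope numbers derived from the architecture table.
# Assumption: "~9B" is treated as exactly 9e9 parameters.
PARAMS = 9e9
BYTES_PER_PARAM = 2  # bfloat16 stores each value in 16 bits

weight_bytes = PARAMS * BYTES_PER_PARAM
weight_gb = weight_bytes / 1e9       # decimal gigabytes
weight_gib = weight_bytes / 2**30    # binary gibibytes

# Grouped-query attention: how many query heads share one KV head.
ATTENTION_HEADS = 16
KV_HEADS = 4
queries_per_kv_head = ATTENTION_HEADS // KV_HEADS

print(f"Weights in bf16: {weight_gb:.1f} GB ({weight_gib:.2f} GiB)")
print(f"Query heads per KV head: {queries_per_kv_head}")
```

Actual memory use at load time will be higher than the weight figure alone, especially for long prompts.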

Requirements

  • transformers >= 5.12.1 (required for qwen3_5 model type)
  • torch >= 2.1
  • trust_remote_code=True when loading
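
The version floors above can be checked before loading the model. The following is a minimal sketch using only the standard library; the helper names (`version_tuple`, `meets_minimum`, `check_requirements`) are hypothetical, and the parser only compares the leading numeric parts of each version string, so pre-release suffixes are ignored.

```python
import re
from importlib import metadata

# Minimum versions taken from the Requirements list.
MINIMUMS = {"transformers": "5.12.1", "torch": "2.1"}


def version_tuple(version: str) -> tuple:
    """Turn '2.1.0+cu121' into (2, 1, 0), keeping only leading digits of each part."""
    parts = []
    for piece in version.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two version strings after padding them to the same length."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements() -> dict:
    """Return {package: (installed_version_or_None, ok)} for each requirement."""
    report = {}
    for name, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
        else:
            report[name] = (installed, meets_minimum(installed, minimum))
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_requirements().items():
        print(f"{name}: installed={installed} ok={ok}")
```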

Usage

Text-only input

import torch
from transformers import AutoModelForImageTextToText, AutoProcessor

model = AutoModelForImageTextToText.from_pretrained(
    "TaimoorSiddiqui/Hopcoder-Mini-9B",
    dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
processor = AutoProcessor.from_pretrained(
    "TaimoorSiddiqui/Hopcoder-Mini-9B",
    trust_remote_code=True,
)

messages = [
    {"role": "user", "content": "What is 2+2?"},
]
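# Render the conversation to a prompt string, then tokenize it.
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=text, return_tensors="pt").to(model.device)
# Generate up to 512 new tokens; the decoded string includes the prompt.
out = model.generate(**inputs, max_new_tokens=512)
print(processor.decode(out[0], skip_special_tokens=True))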

Vision input

import torch
from PIL import Image
from transformers import AutoModelForImageTextToText, AutoProcessor

model = AutoModelForImageTextToText.from_pretrained(
    "TaimoorSiddiqui/Hopcoder-Mini-9B",
    dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
processor = AutoProcessor.from_pretrained(
    "TaimoorSiddiqui/Hopcoder-Mini-9B",
    trust_remote_code=True,
)

image = Image.open("example.jpg")
messages = [
    {"role": "user", "content": [
        {"type": "image", "image": image},
        {"type": "text", "text": "Describe this image."},
    ]},
]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=text, images=image, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=512)
print(processor.decode(out[0], skip_special_tokens=True))
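
Function calling

The Highlights mention native function calling, but this card does not show the wiring. The sketch below illustrates one possible setup under stated assumptions: the `add` tool, the `TOOLS` schema, `REGISTRY`, and `run_tool_call` are all hypothetical names invented for illustration, and the schema uses the JSON-schema layout that chat templates in transformers commonly accept. How the model formats its tool calls in generated text is not documented here, so parsing the model output into a `{"name": ..., "arguments": ...}` dict is left out.

```python
import json


# Hypothetical tool for illustration; not part of the model or this repository.
def add(a: float, b: float) -> float:
    """Return the sum of two numbers."""
    return a + b


# JSON-schema description of the tool (assumed layout, see lead-in).
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "add",
            "description": "Add two numbers.",
            "parameters": {
                "type": "object",
                "properties": {
                    "a": {"type": "number"},
                    "b": {"type": "number"},
                },
                "required": ["a", "b"],
            },
        },
    }
]

# Maps tool names to the Python callables that implement them.
REGISTRY = {"add": add}


def run_tool_call(call: dict) -> str:
    """Execute an already-parsed tool call and return the result as JSON."""
    func = REGISTRY[call["name"]]
    result = func(**call["arguments"])
    return json.dumps({"result": result})


# Assumption: with the processor loaded as in the examples above, the schema
# would be passed through the chat template, for example:
# text = processor.apply_chat_template(
#     messages, tools=TOOLS, tokenize=False, add_generation_prompt=True
# )

print(run_tool_call({"name": "add", "arguments": {"a": 2, "b": 2}}))
```

The returned JSON string would then be appended to the conversation as a tool-role message before generating again.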

Sampling: temperature=0.6, top_p=0.95, top_k=20 (Qwen3.5 defaults).
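
The usage examples above call `generate` with only `max_new_tokens`. If the sampling values need to be set explicitly, one way is to collect them as keyword arguments; `sampling_kwargs` is a hypothetical helper, and `do_sample=True` is added here on the assumption that sampling is not already enabled by the model's generation config.

```python
def sampling_kwargs(max_new_tokens: int = 512) -> dict:
    """Return generate() keyword arguments matching the sampling values above."""
    return {
        "do_sample": True,   # temperature/top_p/top_k only apply when sampling
        "temperature": 0.6,
        "top_p": 0.95,
        "top_k": 20,
        "max_new_tokens": max_new_tokens,
    }


# With `model` and `inputs` from the usage examples:
# out = model.generate(**inputs, **sampling_kwargs())
print(sampling_kwargs())
```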

License

Apache 2.0.