# 🦊 Fox 1.5

## Benchmark Board
| Metric | Value |
| --- | --- |
| Throughput | ~35 tokens/sec (RTX 3050, 6GB VRAM) |
| Avg Latency | ~4-5s per response |
| Success Rate | 100% (5/5 tasks) |
| Tokens/Response | ~150 avg |
| MMLU (ref) | ~72% |
| GSM8K (ref) | ~58% |
| HumanEval (ref) | ~55% |
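The throughput figure can be reproduced with a simple wall-clock measurement. A minimal sketch (the `measure_throughput` helper is an assumption for illustration, not the harness used for the numbers above):

```python
import time

def measure_throughput(model, tokenizer, prompt: str, max_new_tokens: int = 150) -> float:
    """Rough tokens/sec: generated-token count over wall-clock generation time."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    start = time.perf_counter()
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    elapsed = time.perf_counter() - start
    generated = outputs.shape[1] - inputs["input_ids"].shape[1]
    return generated / elapsed
```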
## Task Results
| Task | Prompt | Check | Result |
| --- | --- | --- | --- |
| Math | "A farmer has 17 sheep. All but 9 run away. How many sheep left?" | 9 | ✅ |
| Coding | "Write a Python function to check if a number is prime." | def | ✅ |
| Knowledge | "What is the capital of Greece?" | athens | ✅ |
| Logic | "If all cats are animals and some animals are pets, then some cats are pets. True or false?" | true | ✅ |
| Translation | "Translate to Greek: Hello, how are you?" | γεια | ✅ |
## Quick Facts
| Property | Value |
| --- | --- |
| Base Model | Qwen2.5-7B-Instruct |
| Quantization | GPTQ 4-bit |
| Parameters | 7B |
| Context Length | 32K tokens |
| Size | 5.3GB |
| VRAM Required | ~6GB |
| License | Apache 2.0 |
## Capabilities
- Text & Chat — multilingual conversations, creative writing
- Coding — Python, JavaScript, C++, Rust, Go, 50+ languages
- Reasoning — math, logic, step-by-step problem solving
- Agentic Use — tool calling, function execution, OpenClaw compatible (see the sketch below)
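Because the base model is Qwen2.5-7B-Instruct, tool calling should work through the standard `tools` argument of `apply_chat_template` in `transformers`. A minimal sketch, assuming the Qwen2.5 chat template; the `get_weather` tool is a made-up illustration:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("teolm30/Fox-1.5", trust_remote_code=True)

def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return f"Sunny in {city}"  # illustrative stub

messages = [{"role": "user", "content": "What's the weather in Athens?"}]
# The chat template serializes the tool's signature and docstring into the
# prompt; the model is expected to reply with a structured tool call.
text = tokenizer.apply_chat_template(
    messages, tools=[get_weather], tokenize=False, add_generation_prompt=True
)
print(text)
```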
## Run it
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "teolm30/Fox-1.5"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Build a chat-formatted prompt and generate a response.
messages = [{"role": "user", "content": "Explain quantum entanglement in simple terms"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
For 4-bit GPTQ loading, install the quantization backends first: `pip install auto-gptq optimum`.
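With those packages installed, the GPTQ weights should load through the same `from_pretrained` call. A minimal sketch, assuming the repo ships its quantization config alongside the weights:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# auto-gptq/optimum provide the 4-bit kernels; transformers detects the
# quantization config in the repo, so no extra arguments are needed.
model = AutoModelForCausalLM.from_pretrained(
    "teolm30/Fox-1.5", device_map="auto", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("teolm30/Fox-1.5", trust_remote_code=True)
```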
## Limitations
- Text-only (no vision in base form)
- Image generation requires a separate model
Built by T_craftClaw 🔥 | Owner: teolm30