# 🦊 Fox 1.5 Nova

Fox 1.5 Nova is Teo's code generation model, fine-tuned for competitive programming, systems design, and real-world code patterns across 50+ languages.

## 🏆 Comparison

| Metric | 🦊 Fox 1.5 Nova (Qwen 3B) | Claude Mythos |
|--------|---------------------------|---------------|
| Parameters | ~3B | ~200B+ |
| Speed | ~2.6 tok/s (4-bit) | N/A (API only) |
| Size | 2GB (4-bit) / 5.8GB (fp16) | ~80GB |
| RAM Required | ~8GB | ~256GB |
| VRAM Required | ~4GB | N/A |
| Cost | Free | $5-25 / 1M tokens |
| Runs on CPU | ✅ Yes | ❌ No |
| Internet Required | ❌ No | ✅ Yes |

## 📊 Benchmark Results

| Test Case | Tokens | Time | Speed |
|-----------|--------|------|-------|
| Prime checker | 52 | 20.5s | 2.5 tok/s |
| Binary search | 88 | 33.3s | 2.6 tok/s |
| Stack class | 45 | 17.1s | 2.6 tok/s |
| Quicksort | 84 | 31.8s | 2.6 tok/s |
| Fibonacci DP | 72 | 27.5s | 2.6 tok/s |
| **Average** | - | - | **2.6 tok/s** |
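The exact benchmark harness isn't published; the numbers above can be reproduced with a simple wall-clock measurement along these lines. This is a minimal sketch, assuming `model` and `tokenizer` are loaded as in the Usage section below; the `measure_tok_per_s` helper is hypothetical.

```python
import time

def measure_tok_per_s(model, tokenizer, prompt, max_new_tokens=128):
    # Hypothetical helper: wall-clock tokens-per-second for one generation.
    messages = [{"role": "user", "content": prompt}]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    start = time.perf_counter()
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    elapsed = time.perf_counter() - start
    # Count only newly generated tokens, excluding the prompt.
    new_tokens = outputs.shape[-1] - inputs["input_ids"].shape[-1]
    return new_tokens / elapsed

# Example: measure_tok_per_s(model, tokenizer, "Write a prime checker in Python")
```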

## Code Quality Examples

**Prime checker output:**

```python
def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True
```

**Binary search output:**

```python
def binary_search(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1
```
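Both outputs are directly runnable; a quick sanity check (the test values here are chosen for illustration):

```python
assert is_prime(97) and not is_prime(1)
assert binary_search([1, 3, 5, 7, 9], 7) == 3   # found at index 3
assert binary_search([1, 3, 5, 7, 9], 4) == -1  # absent value
```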

## 📊 Specs

| Metric | Value |
|--------|-------|
| Base Model | Qwen2.5-3B-Instruct |
| Fine-tune Method | QLoRA (4-bit NF4) |
| LoRA r | 16 |
| LoRA alpha | 32 |
| Max Length | 1024 tokens |
| Trainable Params | ~30M |
| Training Steps | 250 |
| Epochs | 10 |
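For reference, the specs above correspond to a PEFT/QLoRA configuration roughly like the sketch below. This is a reconstruction, not the actual training script: `target_modules`, `lora_dropout`, and the compute dtype are assumptions not listed in the card.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization for QLoRA, as listed in the specs table.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: bf16 compute
)

# LoRA hyperparameters from the specs table.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed: attention projections
    lora_dropout=0.05,  # assumption: not listed in the card
    bias="none",
    task_type="CAUSAL_LM",
)
```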

## 💻 Hardware

- Training: NVIDIA RTX 3050 (6GB VRAM) via QLoRA + Unsloth
- Inference: ~4GB VRAM (4-bit) or 8GB+ RAM

## 🚀 Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_name = "teolm30/Fox-1.5-Nova"

# Load in 4-bit NF4 to fit in ~4GB VRAM.
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4")
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)

prompt = "Write a Python LRU cache"
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Send inputs to whichever device device_map="auto" placed the model on.
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
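The comparison table notes the model also runs CPU-only. Since bitsandbytes 4-bit quantization generally requires a CUDA GPU, a CPU run loads the unquantized weights instead; a minimal sketch (the dtype choice is an assumption):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "teolm30/Fox-1.5-Nova"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# CPU-only load: no bitsandbytes, so expect RAM use near the fp16 checkpoint size.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # assumption: fall back to float32 if bf16 is unsupported
    device_map="cpu",
)
```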

## ⚠️ Limitations

- Speed is limited to ~2.6 tok/s at 4-bit (faster at fp16 with more VRAM)
- Small 3B model, optimized for local deployment on modest hardware
- For a larger 7B variant, see teolm30/Fox-1.5-Nova-7B
- No built-in tool use (use the OpenClaw agent framework)

🦊 Built by FoxModelClaw agent for Teo's FoxOS development.
