Caesar 1.0

Our first complete training cycle, end to end: supervised fine-tuning (SFT) followed by Group Relative Policy Optimization (GRPO). ~1B parameters of weights we own.

Caesar isn't a fine-tune of someone else's model. It's a full training run we executed on our own pipeline: supervised fine-tuning on tool-calling data, then more than 600 iterations of GRPO to optimize the policy. The point isn't that a 1B model beats the frontier on benchmarks. It's that we control every byte of how it got there.
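For readers unfamiliar with GRPO, the core idea is that each sampled completion is scored relative to the other completions drawn for the same prompt, rather than against a learned value function. The snippet below is a minimal sketch of that group-relative normalization only. The reward values are made up, the function name is hypothetical, and nothing here is taken from AXE's pipeline; it also uses the population standard deviation, which is one common choice among several.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and spread of its own group.

    `rewards` holds one scalar per completion sampled for a single prompt.
    `eps` avoids division by zero when every completion scores the same.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std; implementations vary
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical rewards for four completions sampled from one prompt,
# e.g. 1.0 when the emitted tool call was valid and 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that score above their group's mean receive a positive advantage and are reinforced; those below the mean are pushed down. When every completion in a group scores the same, all advantages are zero and that group contributes no gradient signal.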

Model details

| Property | Value |
| --- | --- |
| Developer | AXE Technologies |
| Training | Custom SFT + GRPO (600+ iterations) |
| Parameters | ~1B |
| Context | 8K tokens |
| Format | GGUF |
| License | Apache 2.0 |

What it's tuned for

  • Fast edge inference on minimal hardware
  • Tool calling and function dispatch
  • Offline operation — no cloud dependency for any inference step
  • Demonstrating end-to-end training capability on Apple Silicon
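To make "tool calling and function dispatch" concrete, here is a minimal sketch of the host-side half of that loop: the model emits a structured call, and the application routes it to a local function. The JSON shape (`name` plus `arguments`), the tool names, and the `dispatch` helper are all assumptions for illustration; this card does not document Caesar's actual tool-call format.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool: a real one would query a local data source."""
    return f"Weather lookup requested for {city}"


def add(a: float, b: float) -> float:
    """Hypothetical tool: add two numbers."""
    return a + b


# Registry mapping tool names the model may emit to local callables.
TOOLS = {"get_weather": get_weather, "add": add}


def dispatch(model_output: str):
    """Parse an assumed JSON tool call and invoke the matching function."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))


# Example: a tool call in the assumed format, as the model might emit it.
result = dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}')
```

Keeping the registry explicit means the model can only trigger functions the application has chosen to expose, which matters when everything runs offline on the same machine.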

Usage

ollama pull axetechnologies/caesar-1.0
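Once the model is pulled, it can be queried through Ollama's local HTTP API. The sketch below assumes an Ollama server running on its default port (11434) and uses only the Python standard library. The helper names `build_request` and `generate` are hypothetical, and the model tag is copied from the pull command above.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "axetechnologies/caesar-1.0"  # tag from the pull command above


def build_request(prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": MODEL, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout: float = 60.0) -> str:
    """Send the prompt and return the generated text.

    Requires a running Ollama server with the model already pulled.
    """
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `generate("List the files in the current directory.")` returns the model's reply as a string. No request leaves the machine, which matches the offline-operation goal described above.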

The AXE family

Five models, each tuned for a different lane in the inference pipeline:

| Model | Lane | What it does |
| --- | --- | --- |
| Casanova 1.2 | Agency | Tool-calling, multi-step workflows. 27B dense. |
| Geralt 1.3 | Reasoning at scale | 26B parameters of capability, 4B of inference cost. MoE. |
| Pegasus 1.0 | Visible work | Chain-of-thought you can audit. 12B dense. |
| Artemis 1.0 | Speed | Loads in seconds. 4B for edge hardware. |
| Caesar 1.0 | First principles | Our own training cycle. ~1B, end-to-end on our pipeline. |

About AXE Technologies

Canadian in-house AI infrastructure. Built on Apple Silicon. The models run on hardware you can audit — no cloud dependency, no third-party model in the data path.

Website: axetechnologies.ca
