Quantizations of https://huggingface.co/arcee-ai/Arcee-Blitz
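
These GGUF files run in any llama.cpp-compatible client. As a minimal sketch using the llama-cpp-python bindings (the filename below is hypothetical; substitute whichever quantization level you downloaded):

```python
from llama_cpp import Llama

# Hypothetical filename: substitute the quant you actually downloaded.
llm = Llama(
    model_path="Arcee-Blitz-Q4_K_M.gguf",
    n_ctx=32768,  # the model card lists a 32k-token context limit
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the Mistral architecture in two sentences."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```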

From the original README

Arcee-Blitz (24B) is a new Mistral-based 24B model distilled from DeepSeek, designed to be both fast and efficient. We view it as a practical “workhorse” model that can tackle a range of tasks without the overhead of larger architectures.
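
To try the original (unquantized) checkpoint, here is a minimal sketch using Hugging Face transformers; the prompt and generation settings are illustrative, not recommendations from the card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "arcee-ai/Arcee-Blitz"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "What distinguishes a distilled model from its teacher?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=200)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```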

Model Details

  • Architecture Base: Mistral-Small-24B-Instruct-2501
  • Parameter Count: 24B
  • Distillation Data:
    • Merged the Virtuoso pipeline with the Mistral architecture, hot-starting training with over 3B tokens of pretraining distillation from DeepSeek-V3 logits (an illustrative sketch of logit distillation follows this list)
  • Fine-Tuning and Post-Training:
    • After capturing core logits, we performed additional fine-tuning and distillation steps to enhance overall performance.
  • License: Apache-2.0
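
The card does not include the distillation code; purely as an illustration of what logit-level distillation looks like, here is a minimal PyTorch sketch of a temperature-scaled KL loss between teacher and student logits (all names and the temperature value are assumptions, not Arcee's pipeline):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Temperature-scaled KL divergence between teacher and student
    next-token distributions (generic sketch, not Arcee's actual code)."""
    s = F.log_softmax(student_logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)
    # batchmean + T^2 scaling follows the standard Hinton et al. formulation.
    return F.kl_div(s, t, reduction="batchmean") * temperature**2

# Toy shapes: (batch, seq_len, vocab)
student = torch.randn(2, 16, 32000)
teacher = torch.randn(2, 16, 32000)
print(distillation_loss(student, teacher))
```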

Improving World Knowledge

Arcee-Blitz shows a large improvement on MMLU-Pro over the original Mistral-Small-3, reflecting a dramatic increase in world knowledge.

Data contamination checking

We carefully examined our training data and pipeline to avoid contamination. While we’re confident in the validity of these gains, we remain open to further community validation and testing (one of the key reasons we release these models as open-source).
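
The card does not detail the contamination checks; one common approach is n-gram overlap between training documents and benchmark items. A minimal sketch of that idea (purely illustrative, not the pipeline Arcee used):

```python
def ngrams(text, n=13):
    """Set of word-level n-grams; 13-grams are a common contamination heuristic."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def is_contaminated(train_doc, benchmark_item, n=13):
    """Flag a training document that shares any n-gram with a benchmark item."""
    return bool(ngrams(train_doc, n) & ngrams(benchmark_item, n))

# Toy usage: in practice this runs over the full corpus and every benchmark split.
print(is_contaminated("some training text ...", "an MMLU-Pro question ..."))
```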

Benchmark Comparison

| Benchmark | mistral-small-3 | arcee-blitz |
|---|---|---|
| MixEval | 81.6% | 85.1% |
| GPQA Diamond | 42.4% | 43.1% |
| BigCodeBench Complete | 44.4% | 45.5% |
| BigCodeBench Instruct | 34.7% | 35.9% |
| BigCodeBench Complete-hard | 16.2% | 19.6% |
| BigCodeBench Instruct-hard | 15.5% | 15.5% |
| IFEval | 77.44 | 80.60 |
| BBH | 64.46 | 65.00 |
| GPQA | 33.90 | 36.70 |
| MMLU-Pro | 44.70 | 60.20 |
| MuSR | 40.90 | 50.00 |
| Math Level 5 | 12.00 | 38.60 |

Limitations

  • Context length: 32k tokens (may vary with final tokenizer settings and system resources).
  • Knowledge cut-off: training data may not reflect events or developments after June 2024.
GGUF

  • Model size: 24B params
  • Architecture: llama

Available quantizations: 1-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
