How to use from the llama-cpp-python library
# !pip install llama-cpp-python

from llama_cpp import Llama

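llm = Llama.from_pretrained(
	repo_id="OmnipotentFool/Aurvion",
	# Raw string keeps the backslash in the filename literal
	filename=r"craft_output\CRAFT_Q4_K_M.gguf",
)
response = llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)
# The reply text is in the first choice's message
print(response["choices"][0]["message"]["content"])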

CRAFT-Phi3-Mini

CRAFT (Curriculum-guided Reinforced Adaptive Fine-Tuning) is a reasoning-enhanced version of Phi-3-Mini, trained to address three specific failure modes of reinforcement learning applied to small language models: training instability, unreliable reward signals, and outcome-blind learning.

Built for Samsung EnnovateX 2026, Problem Statement 06.

Model Details

  • Base model: microsoft/Phi-3-mini-4k-instruct
  • Training method: SFT warmup + GRPO with three custom components
  • Format: GGUF, 4-bit quantized (Q4_K_M)
  • Size: ~2.2GB
  • License: Apache 2.0

How CRAFT Was Trained

  1. Capability Probe — measured base model difficulty per-question before training
  2. SFT Warmup — QLoRA fine-tuning on GSM8K + AQuA-RAT
  3. CRAFT RL Loop — GRPO with:
    • Deterministic execution verifier (Python-based math reward, NLI-based logic reward)
    • Contrastive step-level preference learning (self-supervised, no human labels)
    • Live adaptive curriculum (dynamic difficulty + KL control)
  4. Deployment — quantized to 4-bit GGUF for on-device inference
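
Two parts of step 3 can be illustrated in a few lines. The sketch below is an assumption about how a deterministic math reward and GRPO's group-relative advantage might look, not the project's actual code. The names `extract_final_number`, `math_reward`, and `group_advantages` are hypothetical, and scoring the last number in a completion is a rule chosen only for illustration.

```python
import re
import statistics
from typing import List, Optional

NUMBER = re.compile(r"-?\d+(?:\.\d+)?")

def extract_final_number(completion: str) -> Optional[float]:
    """Return the last number in the completion, or None if there is none."""
    matches = NUMBER.findall(completion.replace(",", ""))
    return float(matches[-1]) if matches else None

def math_reward(completion: str, reference: float, tol: float = 1e-6) -> float:
    """Deterministic reward: 1.0 if the final number matches the reference."""
    answer = extract_final_number(completion)
    if answer is None:
        return 0.0
    return 1.0 if abs(answer - reference) <= tol else 0.0

def group_advantages(rewards: List[float]) -> List[float]:
    """GRPO-style advantages: normalize each reward within its sampled group.

    Uses the population standard deviation; implementations differ on this.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0.0:
        # All samples scored the same, so the group carries no learning signal.
        return [0.0] * len(rewards)
    return [(r - mean) / std for r in rewards]

# Four hypothetical samples for the prompt "What is 15% of 240?"
completions = [
    "15% of 240 is 0.15 * 240 = 36",
    "240 / 15 = 16",
    "The answer is 36.",
    "I am not sure.",
]
rewards = [math_reward(c, reference=36.0) for c in completions]
advantages = group_advantages(rewards)
print(rewards)
print(advantages)
```

Because the reward is computed by a fixed rule rather than a learned model, the same completion always receives the same score, which is the property the "deterministic verifier" label refers to.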

Benchmark Results

Benchmark  | Baseline (Phi-3-Mini) | CRAFT | Relative improvement
GSM8K      | 48%                   | 62%   | 29.17%
StrategyQA | 42%                   | 71%   | 69.05%
MMLU       | 37%                   | 63%   | 70.27%

Evaluated using lm-evaluation-harness.
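
Relative improvement over the baseline can be derived from the two accuracy columns as (CRAFT - baseline) / baseline. A minimal sketch, with the scores copied from the table; the function name `relative_improvement` is illustrative:

```python
# Accuracy in percent, as (baseline, CRAFT) pairs copied from the table.
scores = {
    "GSM8K": (48, 62),
    "StrategyQA": (42, 71),
    "MMLU": (37, 63),
}

def relative_improvement(baseline: float, tuned: float) -> float:
    """Relative gain over the baseline, in percent."""
    return (tuned - baseline) / baseline * 100

for name, (baseline, tuned) in scores.items():
    gain = relative_improvement(baseline, tuned)
    print(f"{name}: +{tuned - baseline} points, {gain:.2f}% relative")
```

Reporting the absolute gain in percentage points next to the relative figure makes it easier to compare benchmarks that start from different baselines.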

Usage

# pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama(model_path="CRAFT_Q4_K_M.gguf", n_ctx=2048)
output = llm("Solve step by step: What is 15% of 240?", max_tokens=256)
print(output["choices"][0]["text"])
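
The call above sends a raw prompt. Phi-3 instruct models are normally prompted with role markers, so wrapping the question in that format may help. The sketch below assumes CRAFT keeps the base model's `<|user|>` / `<|end|>` / `<|assistant|>` markers, which the card does not confirm. `build_phi3_prompt` and `solve` are hypothetical helper names.

```python
from typing import Dict, List

def build_phi3_prompt(messages: List[Dict[str, str]]) -> str:
    """Format chat messages in the Phi-3 instruct style.

    Assumption: the fine-tuned model keeps the base model's role markers.
    """
    parts = []
    for message in messages:
        parts.append(f"<|{message['role']}|>\n{message['content']}<|end|>\n")
    # A trailing assistant tag asks the model to produce the next turn.
    parts.append("<|assistant|>\n")
    return "".join(parts)

def solve(model_path: str, question: str, max_tokens: int = 256) -> str:
    """Run one question through a local GGUF file and return the reply text."""
    # Imported lazily so the formatter can be used without llama-cpp-python.
    from llama_cpp import Llama

    llm = Llama(model_path=model_path, n_ctx=2048)
    prompt = build_phi3_prompt([{"role": "user", "content": question}])
    output = llm(prompt, max_tokens=max_tokens, stop=["<|end|>"])
    return output["choices"][0]["text"].strip()

prompt = build_phi3_prompt(
    [{"role": "user", "content": "Solve step by step: What is 15% of 240?"}]
)
print(prompt)
```

`solve` reloads the model on every call, which is fine for a one-off test but slow in a loop; for repeated queries, create the `Llama` object once and reuse it. `llm.create_chat_completion` is an alternative that takes the message list directly.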

Intended Use

On-device reasoning for resource-constrained environments — laptops, edge devices, and offline applications requiring multi-step mathematical and logical reasoning without cloud dependency.

Limitations

CRAFT was fine-tuned on math-focused data (GSM8K and AQuA-RAT), so its behavior on tasks far from that distribution is less well characterized. The released file is 4-bit quantized, which can reduce accuracy relative to full-precision weights, and the base model it builds on has a 4k-token context window.

Citation / Acknowledgment

Built for Samsung EnnovateX 2026 Hackathon, Problem Statement 06. Base model: Microsoft Phi-3-Mini.

Repository

Full source code, training pipeline, and documentation: GitHub link
