|
|
--- |
|
|
license: apache-2.0 |
|
|
base_model: mistralai/Ministral-3-8B-Instruct-2512 |
|
|
tags: |
|
|
- mistral |
|
|
- tool-calling |
|
|
- voice-assistant |
|
|
- gguf |
|
|
- lora |
|
|
language: |
|
|
- en |
|
|
pipeline_tag: text-generation |
|
|
--- |
|
|
|
|
|
# CAAL Ministral - Fine-tuned for Tool Calling |
|
|
|
|
|
Fine-tuned [Ministral-3-8B](https://huggingface.co/mistralai/Ministral-3-8B-Instruct-2512) for accurate tool calling in the CAAL voice assistant.
|
|
|
|
|
## Results |
|
|
|
|
|
- ✅ **100% tool-calling accuracy** (15/15 validation cases)
- ✅ **0% hallucinated answers**
- ✅ Matches 14B performance at 8B speed
- ✅ 5.2 GB Q4_K_M quantization
|
|
|
|
|
## Quick Start (Ollama) |
|
|
|
|
|
```bash |
|
|
# Download model |
|
|
huggingface-cli download CoreWorxLab/caal-ministral \ |
|
|
caal-ministral.gguf \ |
|
|
--local-dir . |
|
|
|
|
|
# Create Modelfile |
|
|
cat > Modelfile << 'MODELFILE' |
|
|
FROM ./caal-ministral.gguf |
|
|
|
|
|
PARSER ministral |
|
|
PARAMETER temperature 0.1 |
|
|
PARAMETER num_ctx 4096 |
|
|
|
|
|
SYSTEM """You are CAAL, a witty, action-oriented voice assistant.""" |
|
|
MODELFILE |
|
|
|
|
|
# Import to Ollama |
|
|
ollama create caal-ministral -f Modelfile |
|
|
|
|
|
# Test |
|
|
ollama run caal-ministral |
|
|
``` |
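Once imported, the model can also be queried programmatically through Ollama's REST API. A minimal sketch (non-streaming, against the default local port; the model name matches the `ollama create` step above):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default chat endpoint

def build_chat_request(model: str, user_msg: str) -> dict:
    """Assemble a non-streaming chat payload for Ollama's /api/chat."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_msg}],
        "stream": False,
    }

payload = build_chat_request("caal-ministral", "Premier League scores")

# Uncomment to send against a running Ollama instance:
# req = urllib.request.Request(
#     OLLAMA_URL,
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(json.loads(urllib.request.urlopen(req).read())["message"]["content"])
print(json.dumps(payload))
```

With the temperature already baked into the Modelfile, no extra sampling options are needed in the request.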
|
|
|
|
|
## Training Details |
|
|
|
|
|
- **Base Model:** Ministral-3-8B-Instruct-2512 (4-bit) |
|
|
- **Method:** LoRA (r=16, alpha=16) |
|
|
- **Dataset:** 2,776 examples (tool calls, general knowledge, web search) |
|
|
- **Tool Format:** REST-style with action parameter (e.g., `espn_epl(action="scores")`) |
|
|
- **Training:** 3 epochs on RTX 3060 12GB |
|
|
- **Final Loss:** 0.126 |
|
|
|
|
|
## Performance Comparison |
|
|
|
|
|
| Metric | Base 8B | Base 14B | Fine-tuned 8B | |
|
|
|--------|---------|----------|---------------| |
|
|
| Tool calling accuracy | ~80% | ~100% | **100%** | |
|
|
| Hallucinated answers | ~20% | ~0% | **0%** | |
|
|
| Speed | Fast | Slow | **Fast** | |
|
|
| VRAM (with TTS) | 6GB | 14GB | **6GB** | |
|
|
|
|
|
## Use Cases |
|
|
|
|
|
Voice assistant tool calling: |
|
|
- Smart home control (Home Assistant, TrueNAS) |
|
|
- Calendar/task management (Google, Notion) |
|
|
- Sports scores and schedules (ESPN) |
|
|
- Server status monitoring |
|
|
- Web search for current events |
|
|
|
|
|
## Validation Examples |
|
|
|
|
|
**Successful tool calls (REST-style with action parameter):** |
|
|
- "when is the next f1 race" → `espn_f1(action="schedule")`
- "check my truenas status" → `truenas(action="status")`
- "add a notion task to pack my bag tomorrow" → `notion(action="add", task="pack my bag", due="tomorrow")`
- "Premier League scores" → `espn_epl(action="scores")`
|
|
|
|
|
**General knowledge (no tool):** |
|
|
- "what's the capital of France" → "Paris"
|
|
|
|
|
**Web search:** |
|
|
- "Who is playing at the 2026 half-time show?" → `web_search(query="2026 Super Bowl halftime show lineup")`
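The REST-style calls above can be routed with a small dispatcher. A sketch, with hypothetical handlers standing in for the real ESPN/TrueNAS API calls:

```python
import re

# Hypothetical handlers; a real deployment would call the actual
# ESPN and TrueNAS integrations here.
def espn_epl(action, **kwargs):
    return f"epl:{action}"

def truenas(action, **kwargs):
    return f"truenas:{action}"

TOOLS = {"espn_epl": espn_epl, "truenas": truenas}

CALL_RE = re.compile(r'^(\w+)\((.*)\)$')

def dispatch(call: str) -> str:
    """Parse a call like espn_epl(action="scores") and route it."""
    m = CALL_RE.match(call.strip())
    if not m:
        raise ValueError(f"not a tool call: {call}")
    name, arg_str = m.groups()
    kwargs = dict(re.findall(r'(\w+)="([^"]*)"', arg_str))
    return TOOLS[name](**kwargs)

print(dispatch('espn_epl(action="scores")'))  # epl:scores
```

The regex-based argument parsing assumes double-quoted string values, which matches the call format shown in the examples above.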
|
|
|
|
|
## Quantization Path |
|
|
|
|
|
``` |
|
|
Training: 4-bit bnb (fits 12GB VRAM)
        ↓
Export: LoRA → GGUF
        ↓
Merge: Q4_K_M base + LoRA → F16
        ↓
Quantize: F16 → Q4_K_M (single clean quantization)
|
|
``` |
|
|
|
|
|
## Limitations |
|
|
|
|
|
- Trained on REST-style tool format with action parameters |
|
|
- Requires proper tool descriptions in system prompt |
|
|
- Low temperature (0.1) recommended for deterministic behavior |
|
|
- Designed for voice assistant use cases |
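On the second point, a sketch of assembling tool descriptions into the system prompt. The exact description format the fine-tune expects is an assumption here; match whatever format was used during training:

```python
# Hypothetical tool registry; names mirror the validation examples above.
TOOLS = [
    {"name": "espn_epl", "actions": ["scores", "schedule"],
     "desc": "Premier League scores and fixtures"},
    {"name": "truenas", "actions": ["status"],
     "desc": "TrueNAS server status"},
]

def render_system_prompt(persona: str, tools: list) -> str:
    """Append REST-style tool descriptions to the persona line."""
    lines = [persona, "",
             "Available tools (REST-style, call with an action parameter):"]
    for t in tools:
        lines.append(f'- {t["name"]}(action=...): {t["desc"]} '
                     f'(actions: {", ".join(t["actions"])})')
    return "\n".join(lines)

prompt = render_system_prompt(
    "You are CAAL, a witty, action-oriented voice assistant.", TOOLS)
print(prompt)
```

The rendered string would replace the bare `SYSTEM` line in the Modelfile shown in Quick Start.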
|
|
|
|
|
## Hardware Requirements |
|
|
|
|
|
**Inference:** |
|
|
- GPU: 6GB VRAM (runs alongside Kokoro TTS on a 12GB card)
|
|
- CPU: Compatible but slower |
|
|
- RAM: 8GB minimum |
|
|
|
|
|
## License |
|
|
|
|
|
Apache 2.0 (matches the base model)
|
|
|
|
|
## Citation |
|
|
|
|
|
```bibtex |
|
|
@misc{caal-ministral-2026, |
|
|
author = {CoreWorxLab}, |
|
|
title = {CAAL Ministral: Fine-tuned Tool Calling Model}, |
|
|
year = {2026}, |
|
|
publisher = {Hugging Face}, |
|
|
url = {https://huggingface.co/CoreWorxLab/caal-ministral} |
|
|
} |
|
|
``` |
|
|
|
|
|
## Links |
|
|
|
|
|
- [Base Model](https://huggingface.co/mistralai/Ministral-3-8B-Instruct-2512) |
|
|
- [CAAL Project](https://github.com/CoreWorxLab/caal) |
|
|
|
|
|
## Acknowledgments |
|
|
|
|
|
Trained using [Unsloth](https://github.com/unslothai/unsloth) for efficient LoRA fine-tuning. |
|
|
|