cmac86 committed · Commit d6ede03 · verified · Parent(s): 197df07

Upload README.md with huggingface_hub

Files changed (1): README.md (+142 -0)
---
license: apache-2.0
base_model: mistralai/Ministral-3-8B-Instruct-2512
tags:
- mistral
- tool-calling
- voice-assistant
- gguf
- lora
language:
- en
pipeline_tag: text-generation
---

# CAAL Ministral - Fine-tuned for Tool Calling

Fine-tuned [Ministral-3-8B](https://huggingface.co/mistralai/Ministral-3-8B-Instruct-2512) for accurate tool calling in the CAAL voice assistant.

## Results

- ✅ **100% tool-calling accuracy** (15/15 validation cases)
- ✅ **0% hallucinated answers**
- ✅ Matches 14B performance at 8B speed
- ✅ 5.2GB Q4_K_M quantization

## Quick Start (Ollama)

```bash
# Download the model
huggingface-cli download CoreWorxLab/caal-ministral \
  caal-ministral-Q4_K_M.gguf \
  --local-dir .

# Create a Modelfile
cat > Modelfile << 'MODELFILE'
FROM ./caal-ministral-Q4_K_M.gguf

PARSER ministral
PARAMETER temperature 0.1
PARAMETER num_ctx 4096

SYSTEM """You are CAAL, a witty, action-oriented voice assistant."""
MODELFILE

# Import into Ollama
ollama create caal-ministral -f Modelfile

# Test
ollama run caal-ministral
```
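
Once imported, the model can also be driven programmatically: recent Ollama releases expose a `/api/chat` REST endpoint that accepts OpenAI-style tool schemas in a `tools` field. The sketch below only builds the JSON request body (no network call); the `espn_epl` schema is an illustrative guess at a CAAL tool definition, not taken from the project.

```python
import json

def build_chat_request(model: str, user_text: str, tools: list) -> str:
    """Build a JSON body for Ollama's /api/chat endpoint (no network call here)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "tools": tools,  # OpenAI-style function schemas
        "stream": False,
        "options": {"temperature": 0.1, "num_ctx": 4096},  # mirror the Modelfile
    }
    return json.dumps(payload)

# Illustrative tool schema (hypothetical -- not taken from the CAAL project):
espn_epl_tool = {
    "type": "function",
    "function": {
        "name": "espn_epl",
        "description": "Premier League scores and schedules",
        "parameters": {
            "type": "object",
            "properties": {
                "action": {"type": "string", "enum": ["scores", "schedule"]},
            },
            "required": ["action"],
        },
    },
}

body = build_chat_request("caal-ministral", "Premier League scores", [espn_epl_tool])
```

POST the resulting body to `http://localhost:11434/api/chat`; when the model opts to call a tool, the reply's `message.tool_calls` entries carry the function name and arguments.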

## Training Details

- **Base Model:** Ministral-3-8B-Instruct-2512 (4-bit)
- **Method:** LoRA (r=16, alpha=16)
- **Dataset:** 2,776 examples (tool calls, general knowledge, web search)
- **Tool Format:** REST-style with an action parameter (e.g., `espn_epl(action="scores")`)
- **Training:** 3 epochs on an RTX 3060 12GB
- **Final Loss:** 0.126

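For intuition on why this fits a 12GB card: LoRA at r=16 freezes the base weights and trains only two low-rank factors per adapted projection, so the trainable parameters for one layer drop from `d_out * d_in` to `r * (d_in + d_out)`. A quick sketch (the 4096x4096 projection size is illustrative, not read from the model config):

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one linear layer: A (r x d_in) plus B (d_out x r)."""
    return r * d_in + d_out * r

# Illustrative projection size (not read from the model config):
d_in = d_out = 4096
full = d_in * d_out                      # full fine-tune of this layer: 16,777,216 weights
lora = lora_params(d_in, d_out, r=16)    # LoRA at r=16: 131,072 trainable weights
print(f"LoRA trains {lora / full:.2%} of this layer's weights")  # -> 0.78%
```
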
## Performance Comparison

| Metric                | Base 8B | Base 14B | Fine-tuned 8B |
|-----------------------|---------|----------|---------------|
| Tool-calling accuracy | ~80%    | ~100%    | **100%**      |
| Hallucinated answers  | ~20%    | ~0%      | **0%**        |
| Speed                 | Fast    | Slow     | **Fast**      |
| VRAM (with TTS)       | 6GB     | 14GB     | **6GB**       |

## Use Cases

Voice assistant tool calling:

- Smart home control (Home Assistant, TrueNAS)
- Calendar/task management (Google, Notion)
- Sports scores and schedules (ESPN)
- Server status monitoring
- Web search for current events

## Validation Examples

**Successful tool calls (REST-style with action parameter):**

- "when is the next f1 race" → `espn_f1(action="schedule")`
- "check my truenas status" → `truenas(action="status")`
- "add a notion task to pack my bag tomorrow" → `notion(action="add", task="pack my bag", due="tomorrow")`
- "Premier League scores" → `espn_epl(action="scores")`

**General knowledge (no tool):**

- "what's the capital of France" → "Paris"

**Web search:**

- "Who is playing at the 2026 half-time show?" → `web_search(query="2026 Super Bowl halftime show lineup")`

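Downstream code can parse this REST-style output with a few lines. The sketch below assumes the grammar visible in the examples above — a function name followed by comma-separated `key="value"` pairs — and may not cover every string the model emits:

```python
import re

CALL_RE = re.compile(r'^(\w+)\((.*)\)$')   # name(...)
ARG_RE = re.compile(r'(\w+)="([^"]*)"')    # key="value" pairs

def parse_tool_call(text: str):
    """Split e.g. 'truenas(action="status")' into ('truenas', {'action': 'status'}).

    Returns None for plain-text replies (no tool call).
    """
    m = CALL_RE.match(text.strip())
    if m is None:
        return None
    name, arg_str = m.group(1), m.group(2)
    return name, dict(ARG_RE.findall(arg_str))
```

A `None` result signals a direct spoken answer (like the "Paris" example), so the assistant can route tool calls and plain replies separately.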
## Quantization Path

```
Training: 4-bit bnb (fits 12GB VRAM)
        ↓
Export: LoRA → GGUF
        ↓
Merge: Q4_K_M base + LoRA → F16
        ↓
Quantize: F16 → Q4_K_M (single clean quantization)
```

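The 5.2GB file size lines up with Q4_K_M's mixed 4-bit/6-bit blocks, which land around 5 effective bits per weight once scales and higher-precision tensors are counted. A back-of-envelope check (the ~5.2 bits-per-weight figure is an assumption, not read from the GGUF header):

```python
def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough GGUF file size: parameter count times average bits per weight, in GB."""
    return n_params * bits_per_weight / 8 / 1e9

# ~8B parameters at an assumed ~5.2 effective bits/weight for Q4_K_M:
size = gguf_size_gb(8e9, 5.2)
print(f"~{size:.1f} GB")  # -> ~5.2 GB
```
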
## Limitations

- Trained on a REST-style tool format with action parameters
- Requires proper tool descriptions in the system prompt
- Low temperature (0.1) recommended for deterministic behavior
- Designed for voice assistant use cases

## Hardware Requirements

**Inference:**

- GPU: 6GB VRAM (runs alongside Kokoro TTS on a 12GB card)
- CPU: compatible, but slower
- RAM: 8GB minimum

## License

Apache 2.0 (matches the base model)

## Citation

```bibtex
@misc{caal-ministral-2026,
  author = {CoreWorxLab},
  title = {CAAL Ministral: Fine-tuned Tool Calling Model},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/CoreWorxLab/caal-ministral}
}
```

## Links

- [Base Model](https://huggingface.co/mistralai/Ministral-3-8B-Instruct-2512)
- [CAAL Project](https://github.com/CoreWorxLab/caal)

## Acknowledgments

Trained with [Unsloth](https://github.com/unslothai/unsloth) for efficient LoRA fine-tuning.