Prevolut/Chapper-MCP-Vision-Slim-IT

@@ -1,3 +1,4 @@
 ---
 license: apache-2.0
 base_model: google/gemma-4-E2B-it
@@ -12,9 +13,10 @@ tags:
   - chapper
   - ios-client
   - tools
   - lm-studio
   - prevolut
-pipeline_tag: text-generation
 language:
   - multilingual
   - en
@@ -39,15 +41,24 @@ datasets:
 Developed by **Prevolut Ltd**, this model serves as the local intelligence engine powering **[Chapper – AI & LM Studio Client](https://apps.apple.com/de/app/chapper-ai-lm-studio-client/id6760984679)**, a native iOS application designed for privacy-first LLM inference, either on-device or via a server.
-While purposefully built to drive the Chapper ecosystem, its strong logical foundation makes it a highly capable, lightweight agent for any general-purpose application requiring strict JSON tool-use and multimodal analysis.
-## 🌍 Multilingual Capabilities
-Inheriting the massive linguistic foundation of its base architecture, this model is fluent in **over 100 languages**. Whether processing inputs or generating complex JSON structures, it maintains high logical fidelity across English, German, French, Spanish, Italian, Dutch, Mandarin, Japanese, Korean, and many more.
-## 🚀 Why this model?
-Running fully autonomous, vision-capable agents on mobile hardware requires extreme efficiency. We needed a model that understands complex UI screenshots, follows strict JSON formatting rules, and retains general reasoning—all without sacrificing device performance or battery life.
-By utilizing advanced quantization techniques (averaging ~6.8 bits with 4-bit text layers and 16-bit vision layers via MLX), this model achieves desktop-grade tool-use natively on mobile edge devices.
 ## 📚 Training Data & Mix
 To achieve the perfect balance between strict syntax discipline and dynamic logic, we curated a massive, multi-tiered dataset:
@@ -58,7 +69,9 @@ To achieve the perfect balance between strict syntax discipline and dynamic logi
 ## ⚡️ Inference & Prompt Format
-This model strictly follows the standard Gemma IT prompt template. To utilize its vision capabilities and MCP formatting, ensure your inputs are structured correctly:
 ```xml
 <start_of_turn>user
@@ -66,11 +79,11 @@ Analyze this UI screenshot and format the action as a valid Chapper MCP request.
 <start_of_turn>assistant
 ```
-## 💻 Usage
 Designed for edge inference, this model shines on Apple Silicon (macOS/iOS) and within fast local environments.
-### 🣱 Natively on iOS via Apple MLX
 We highly recommend running this via the `mlx-swift` / `mlx-vlm` libraries for direct GPU acceleration on iPhones and Macs:
 ```swift
@@ -91,4 +104,9 @@ print(result.text) // Outputs perfect <mcp-request> syntax!
 For `.gguf` variants, the model can be natively loaded into LM Studio. **Crucial:** To enable vision capabilities, you must load the accompanying `-mmproj.gguf` Vision Adapter in the hardware settings alongside the main model.
 ## ⚖️ License
-This model is released under the **Apache 2.0 License**, inheriting the open and permissive nature of its base architecture.

+```markdown
 ---
 license: apache-2.0
 base_model: google/gemma-4-E2B-it
   - chapper
   - ios-client
   - tools
+  - tool-use
   - lm-studio
   - prevolut
+pipeline_tag: image-text-to-text
 language:
   - multilingual
   - en
 Developed by **Prevolut Ltd**, this model serves as the local intelligence engine powering **[Chapper – AI & LM Studio Client](https://apps.apple.com/de/app/chapper-ai-lm-studio-client/id6760984679)**, a native iOS application designed for privacy-first LLM inference, either on-device or via a server.
+We engineered this model to bridge the gap between lightweight edge computing and advanced structural reasoning. While purposefully built to drive the Chapper ecosystem, its strict adherence to JSON formatting and robust logical foundation make it a highly capable agent for any general-purpose application requiring complex tool orchestration and multimodal analysis.
+## 🎯 Key Features & Enhancements
+* **Socratic Reasoning Engine:** Instead of guessing answers, the model is trained to break down complex, multi-stage system problems step-by-step, running internal plausibility checks before outputting the final result.
+* **Format & Syntax Discipline:** Highly disciplined in maintaining strict output structures. It isolates data cleanly and is exceptionally stable at generating pure JSON blocks without conversational clutter.
+* **MCP & Tool Orchestration Ready:** Due to its strict formatting adherence, this model is an ideal candidate for serving as a local agent interacting with the Model Context Protocol (MCP), executing API calls, and managing local system states.
+* **Multimodal & Vision Capable:** Reads, analyzes, and translates UI screenshots, diagrams, and visual inputs directly into actionable code or structured tool payloads.
+* **Edge Optimized:** Achieves desktop-grade tool-use natively on mobile edge devices using advanced quantization techniques (~6.8 bits with 4-bit text layers and 16-bit vision layers via MLX).
+## 💻 Intended Use Cases
+* **Local AI Agents:** Powering privacy-first, on-device assistants on iOS, iPadOS, and macOS.
+* **System Orchestration:** Translating natural language and visual inputs into structured JSON payloads for tool execution.
+* **Complex Logic Tasks:** Solving dynamic UI challenges, mathematical deductions, and multi-variable logic puzzles on the fly.
+## 🌍 Multilingual Capabilities
+Inheriting the massive linguistic foundation of its base architecture, this model is fluent in **over 100 languages**. Whether processing inputs or generating complex JSON structures, it maintains high logical fidelity across English, German, French, Spanish, Italian, Dutch, Mandarin, Japanese, Korean, and many more.
 ## 📚 Training Data & Mix
 To achieve the perfect balance between strict syntax discipline and dynamic logic, we curated a massive, multi-tiered dataset:
 ## ⚡️ Inference & Prompt Format
+This model strictly follows the standard Gemma IT prompt template. To utilize its vision capabilities and MCP formatting, ensure your inputs are structured correctly.
+To leverage the model's structural discipline for tool calls, we recommend enforcing rules in your system prompts (e.g., *"You are a local system agent. If you need to use a tool, output ONLY a valid JSON block. Do not add any conversational text before or after the JSON."*).
 ```xml
 <start_of_turn>user
 <start_of_turn>assistant
 ```
+## 🛠️ Usage
 Designed for edge inference, this model shines on Apple Silicon (macOS/iOS) and within fast local environments.
+### 📱 Natively on iOS via Apple MLX
 We highly recommend running this via the `mlx-swift` / `mlx-vlm` libraries for direct GPU acceleration on iPhones and Macs:
 ```swift
 For `.gguf` variants, the model can be natively loaded into LM Studio. **Crucial:** To enable vision capabilities, you must load the accompanying `-mmproj.gguf` Vision Adapter in the hardware settings alongside the main model.
 ## ⚖️ License
+This model is released under the **Apache 2.0 License**, inheriting the open and permissive nature of its base architecture.
+---
+*Developed with a focus on local AI efficiency by **Prevolut Ltd***
+```
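
The system-prompt recommendation in the updated card asks the model to output only a valid JSON block when calling a tool. A client may still want to enforce that rule on its own side. The following is a minimal Python sketch of such a check. The card does not specify the Chapper MCP payload schema, so the `tool` and `arguments` keys and the `parse_tool_call` helper are hypothetical and only illustrate the idea.

```python
import json
from typing import Any, Optional


def parse_tool_call(raw: str) -> Optional[dict[str, Any]]:
    """Return the parsed tool call if `raw` is exactly one JSON object, else None.

    The "tool" / "arguments" keys are an assumed payload shape, not the
    documented Chapper MCP schema.
    """
    text = raw.strip()

    # Tolerate a surrounding Markdown code fence such as ```json ... ```.
    if text.startswith("```") and text.endswith("```"):
        lines = text.splitlines()
        text = "\n".join(lines[1:-1]).strip()

    # json.loads rejects any leading or trailing conversational text.
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        return None

    # Only a JSON object with a string tool name counts as a tool call.
    if not isinstance(payload, dict):
        return None
    if not isinstance(payload.get("tool"), str):
        return None
    # "arguments" is optional here, but must be an object when present.
    if not isinstance(payload.get("arguments", {}), dict):
        return None
    return payload
```

A caller could retry the generation or show the raw text to the user whenever this returns `None`. Accepting a surrounding code fence is a design choice; a stricter client could drop that branch.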