---
datasets:
- bigcode/the-stack
- bigcode/the-stack-v2
- bigcode/starcoderdata
- bigcode/commitpack
---

# Llama-Coyote.Coder-4B (GGUF)

## 📌 Model Overview

* Model Name: WithinUsAI/Llama-Coyote.Coder-4B.gguf
* Organization: Within Us AI
* Model Type: Code LLM (Instruction-Tuned, Agentic-Oriented)
* Parameter Size: 4B
* Format: GGUF (quantized for local inference)
* Primary Focus: Efficient coding + reasoning for local deployment

This model is part of the Within Us AI ecosystem of compact, high-performance coding models, designed to run locally while still delivering structured reasoning and practical software engineering output.

⸻

## 🧬 Architecture & Lineage

* Base Family: LLaMA-derived architecture (inferred from naming and ecosystem patterns)
* Model Class: Dense transformer (~4B parameters)
* Optimization Strategy:
  * Instruction tuning for coding tasks
  * Reasoning-aware outputs
  * GGUF quantization for edge deployment

### Ecosystem Position

This model sits alongside:

* Other 4B coding models
* Agentic coders
* Reasoning-distilled systems

WithinUsAI focuses on agentic AI, tool use, and evaluation-driven training pipelines.

⸻

## 🧠 Core Design Philosophy

Think of this model like a desert-hardened code hunter 🐺💻

Lean, efficient, and tuned to track down solutions without wasting compute.

Design Goals:

* Maximize coding performance per parameter
* Encourage structured, step-by-step reasoning
* Enable local-first AI development
* Support agent-style workflows

⸻

## ⚙️ Key Capabilities

### 💻 Coding

* Multi-language support (Python, JS, C++, etc.)
* Function generation and refactoring
* Debugging assistance
* Algorithm design

### 🤖 Agentic Behavior

* Task decomposition
* Instruction-following
* Compatible with tool-calling frameworks

### 🧠 Reasoning

* Step-by-step logic chains
* Problem breakdown
* Lightweight analytical reasoning

⸻

## 📦 GGUF Format & Deployment

Optimized for local inference environments.

Supported Runtimes:

* llama.cpp
* LM Studio
* Ollama (GGUF-compatible builds)

Typical Quantization Options (4B):

| Quant | RAM Needed | Notes |
| --- | --- | --- |
| Q4_K_M | ~3–4 GB | Best balance |
| Q5_K_M | ~4–5 GB | Higher quality |
| Q8_0 | ~6–8 GB | Maximum fidelity |

⸻

## 🚀 Intended Use

### ✅ Ideal Use Cases

* Local coding assistants
* AI-powered IDE integrations
* Autonomous coding agents
* Script generation & debugging
* Offline development workflows

### ⚠️ Limitations

* Smaller parameter size limits deep reasoning compared with larger models
* Performance depends on prompt clarity
* Tool use requires external orchestration

⸻

## 🛠️ Usage Example (llama.cpp)

Recent llama.cpp builds name this binary `llama-cli` instead of `main`; the flags shown are the same.

    ./main -m Llama-Coyote.Coder-4B.Q4_K_M.gguf \
      -p "Write a Python script that monitors file changes and logs them." \
      -n 512

⸻

## 🧪 Training & Methodology

The Within Us AI training approach includes:

* Code-focused instruction tuning
* Reasoning trace exposure
* Evaluation-driven dataset design
* Agentic workflow alignment

### Data Sources

* Proprietary datasets created by Within Us AI
* Third-party datasets used without ownership claims
* Focus on:
  * Code reasoning
  * Debugging patterns
  * Structured outputs

⸻

## 📊 Expected Performance Profile

| Capability | Strength |
| --- | --- |
| Coding | High |
| Efficiency | Very High |
| Reasoning depth | Moderate |
| General knowledge | Moderate |
| Agent readiness | High |

⸻

## 📜 License

License Type: Custom / Other (Within Us AI License Approach)

Terms:

* Base architecture derived from third-party LLM ecosystems (e.g., LLaMA family)
* Within Us AI developed:
  * Fine-tuning process
  * Model merging techniques
  * Training methodology
* Third-party datasets may be used without ownership claims
* Credit belongs to original creators

⸻

## 🙏 Acknowledgements

* Meta (LLaMA architecture inspiration)
* Open-source GGUF / llama.cpp ecosystem
* Hugging Face community
* Dataset creators and contributors

⸻

## 🔗 Links

* Model: https://huggingface.co/WithinUsAI/Llama-Coyote.Coder-4B.gguf
* Organization: https://huggingface.co/WithinUsAI

⸻

## 🧩 Closing Note

This one feels like a quiet operator in the sand 🏜️

Not loud. Not oversized. Just tracks the problem… and delivers code that works.
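
As a rough illustration of how the quantization table and the llama.cpp usage example in this card fit together, the Python sketch below picks a quantization from the RAM you have available and builds the matching command line. The helper names `pick_quant` and `build_command` are hypothetical, the RAM thresholds are simply the upper ends of the approximate ranges listed in the table, and the file naming for quants other than Q4_K_M is an assumption extrapolated from the single filename shown in the usage example.

```python
import shlex
from typing import List, Optional

# Upper ends of the approximate RAM ranges from the quantization table,
# ordered from highest fidelity to lowest. These are estimates, not measurements.
QUANT_RAM_GB = [
    ("Q8_0", 8.0),
    ("Q5_K_M", 5.0),
    ("Q4_K_M", 4.0),
]


def pick_quant(available_gb: float) -> Optional[str]:
    """Return the highest-fidelity quant whose upper RAM estimate fits, else None."""
    for name, ram_gb in QUANT_RAM_GB:
        if available_gb >= ram_gb:
            return name
    return None


def build_command(quant: str, prompt: str, n_predict: int = 512,
                  binary: str = "./main") -> List[str]:
    """Build the llama.cpp argument list used in the usage example.

    The filename pattern is assumed from the Q4_K_M example in this card.
    """
    model_file = f"Llama-Coyote.Coder-4B.{quant}.gguf"
    return [binary, "-m", model_file, "-p", prompt, "-n", str(n_predict)]


if __name__ == "__main__":
    quant = pick_quant(6.0)  # e.g. a machine with about 6 GB of free RAM
    if quant is None:
        print("Not enough RAM for any listed quantization.")
    else:
        cmd = build_command(
            quant,
            "Write a Python script that monitors file changes and logs them.",
        )
        # Print a shell-safe command string; pass `cmd` to subprocess.run to execute it.
        print(shlex.join(cmd))
```

Returning an argument list rather than a single string keeps the prompt safe from shell quoting problems if the list is later passed to `subprocess.run`. Pass `binary="llama-cli"` (or its path) when using a recent llama.cpp build.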