+---
+license: mit
+base_model: AgentFlow/agentflow-planner-7b
+tags:
+  - quantized
+  - GGUF
+  - planning
+  - agent
+  - reasoning
+  - qwen2.5
+model_type: qwen2
+quantized_by: kh0pp
+---
+# AgentFlow Planner 7B - GGUF
+Quantized GGUF versions of [AgentFlow/agentflow-planner-7b](https://huggingface.co/AgentFlow/agentflow-planner-7b) for efficient local inference.
+## 📋 Model Details
+**AgentFlow Planner 7B** is a specialized language model fine-tuned from Qwen2.5-7B-Instruct, designed specifically for **planning and agentic reasoning tasks**. This model excels at breaking down complex tasks into manageable steps, analyzing dependencies, and creating effective execution plans.
+### Base Model Information
+- **Base**: Qwen2.5-7B-Instruct
+- **Parameters**: 7.62 billion
+- **Context Length**: 32,768 tokens
+- **License**: MIT
+- **Specialization**: Planning, multi-step reasoning, tool integration
+- **Original Repository**: [AgentFlow/agentflow-planner-7b](https://huggingface.co/AgentFlow/agentflow-planner-7b)
+- **Research**: [AgentFlow GitHub](https://github.com/lupantech/AgentFlow)
+### About AgentFlow
+AgentFlow is an agentic framework built from four specialized modules:
+- **Planner** (this model): Strategic task decomposition and planning
+- **Executor**: Action execution
+- **Verifier**: Result validation
+- **Generator**: Output synthesis
+According to the AgentFlow authors, the Planner model outperforms larger models such as GPT-4o on certain planning benchmarks; see the research repository linked above for the reported results.
+## 📦 Available Quantizations
+All files were produced with llama.cpp's conversion and quantization tools.
+| Filename | Quant | Size | Use Case | Memory Required |
+|----------|-------|------|----------|-----------------|
+| `agentflow-planner-7b-f16.gguf` | F16 | 15.0 GB | Unquantized 16-bit, best quality | ~17 GB |
+| `agentflow-planner-7b-Q8_0.gguf` | Q8_0 | 7.6 GB | Near-lossless quality, faster than F16 | ~10 GB |
+| `agentflow-planner-7b-Q5_K_M.gguf` | Q5_K_M | 5.1 GB | High quality | ~7 GB |
+| `agentflow-planner-7b-Q4_K_M.gguf` | Q4_K_M | 4.4 GB | ⭐ **Recommended** - Best balance | ~6 GB |
+### Quantization Recommendations
+- **Q4_K_M**: Best for most users - excellent quality/speed/size balance
+- **Q5_K_M**: When you need slightly higher quality and have the extra memory (RAM or VRAM)
+- **Q8_0**: Highest quality among the quantized files, at about half the size of F16
+- **F16**: For research, or when you need the absolute best quality
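The recommendations above can be turned into a small selection helper. This is a minimal sketch, assuming the "Memory Required" column of the table is the only constraint; the `pick_quant` function and the `QUANTS` list are hypothetical names introduced for illustration, and the numbers are copied from the table rather than measured.

```python
# (name, file size in GB, approximate memory required in GB), copied from the
# table above and ordered from highest to lowest quality.
QUANTS = [
    ("F16", 15.0, 17.0),
    ("Q8_0", 7.6, 10.0),
    ("Q5_K_M", 5.1, 7.0),
    ("Q4_K_M", 4.4, 6.0),
]


def pick_quant(available_gb):
    """Return the highest-quality variant whose memory estimate fits, or None."""
    for name, _size_gb, required_gb in QUANTS:
        if required_gb <= available_gb:
            return name
    return None  # even Q4_K_M does not fit


print(pick_quant(8))   # Q5_K_M
print(pick_quant(4))   # None
```

The memory figures are approximate, so leave some headroom in practice, especially with the full 32,768-token context.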
+## 🚀 Usage
+### Ollama (Recommended)
+**Quick Start:**
+```bash
+# Download the Q4_K_M model
+huggingface-cli download kh0pp/agentflow-planner-7b-GGUF agentflow-planner-7b-Q4_K_M.gguf --local-dir .
+# Create Modelfile
+cat > Modelfile << 'EOF'
+FROM ./agentflow-planner-7b-Q4_K_M.gguf
+TEMPLATE """{{ if .System }}<|im_start|>system
+{{ .System }}<|im_end|>
+{{ end }}{{ if .Prompt }}<|im_start|>user
+{{ .Prompt }}<|im_end|>
+{{ end }}<|im_start|>assistant
+{{ .Response }}<|im_end|>
+"""
+PARAMETER temperature 0.7
+PARAMETER top_p 0.9
+PARAMETER top_k 40
+PARAMETER num_ctx 32768
+PARAMETER repeat_penalty 1.1
+SYSTEM """You are an advanced AI agent specialized in planning and reasoning. You excel at breaking down complex tasks into manageable steps, analyzing dependencies, and creating effective execution plans."""
+EOF
+# Create and run
+ollama create agentflow-planner:7b -f Modelfile
+ollama run agentflow-planner:7b
+```
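Once the model has been created, it can also be called programmatically. The sketch below assumes an Ollama server running on its default local port (11434) and uses Ollama's `/api/chat` endpoint with only the Python standard library; the helper names `build_chat_request` and `ask` are hypothetical and exist only for this example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # assumed default local endpoint
MODEL_NAME = "agentflow-planner:7b"             # name given to `ollama create` above


def build_chat_request(prompt):
    """Build a non-streaming chat request for the local Ollama server."""
    payload = {
        "model": MODEL_NAME,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return one JSON object instead of a token stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        return json.loads(resp.read())["message"]["content"]
```

With the server running, `ask("Plan a zero-downtime database migration")` should return the generated plan as a string. Because the request carries no system message, the `SYSTEM` prompt from the Modelfile is expected to apply.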
+### llama.cpp
+```bash
+# Download the model
+huggingface-cli download kh0pp/agentflow-planner-7b-GGUF agentflow-planner-7b-Q4_K_M.gguf --local-dir .
+# Run with llama.cpp
+./llama-cli -m agentflow-planner-7b-Q4_K_M.gguf \
+  -p "Create a detailed plan for building a web application" \
+  -n 512 -c 4096
+```
+### LM Studio
+1. Download any GGUF file from this repository
+2. Load it in LM Studio
+3. Use the Qwen2 chat template
+4. Recommended settings:
+   - Temperature: 0.7
+   - Top P: 0.9
+   - Context: 32768
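The Qwen2 chat template named in step 3 is the ChatML layout, the same one the Ollama Modelfile spells out. As a rough illustration of what the model sees for a single turn, the sketch below renders that layout by hand; `render_chatml` is a hypothetical helper, and real runtimes apply the template for you, so this is only useful for debugging or raw-completion APIs.

```python
def render_chatml(prompt, system=""):
    """Render one user turn in ChatML, ending where the assistant reply begins."""
    parts = []
    if system:
        parts.append(f"<|im_start|>system\n{system}<|im_end|>\n")
    if prompt:
        parts.append(f"<|im_start|>user\n{prompt}<|im_end|>\n")
    # The assistant header is left open; generation stops at <|im_end|>.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


print(render_chatml(
    "Create a detailed plan for building a web application",
    system="You are an advanced AI agent specialized in planning and reasoning.",
))
```

If a frontend produces rambling or never-ending output, comparing its raw prompt against this layout is a quick way to check whether the wrong template was selected.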
+### Python (llama-cpp-python)
+```python
+from llama_cpp import Llama
+llm = Llama(
+    model_path="agentflow-planner-7b-Q4_K_M.gguf",
+    n_ctx=32768,
+    n_gpu_layers=-1,  # Offload all layers to the GPU; set to 0 for CPU-only
+)
+response = llm.create_chat_completion(
+    messages=[
+        {"role": "system", "content": "You are an advanced AI agent specialized in planning and reasoning."},
+        {"role": "user", "content": "Create a detailed project plan for developing a mobile app"}
+    ],
+    temperature=0.7,
+    max_tokens=512,
+)
+print(response['choices'][0]['message']['content'])
+```
+## 💡 Example Use Cases
+This model excels at:
+- **Project Planning**: Breaking down complex projects into phases and tasks
+- **Code Architecture**: Designing system architectures and implementation strategies
+- **Research Planning**: Creating research methodologies and experiment designs
+- **Workflow Optimization**: Analyzing and improving processes
+- **Multi-Step Problem Solving**: Decomposing complex problems into solvable steps
+- **Tool Integration**: Planning how to use multiple tools to accomplish goals
+## 🔧 Technical Details
+- **Formats**: llama.cpp Q4_K_M, Q5_K_M, Q8_0 (quantized) and F16 (unquantized)
+- **Original Format**: SafeTensors (7 files, ~30 GB)
+- **Conversion Tool**: llama.cpp convert_hf_to_gguf.py
+- **Tested With**: Ollama 0.1.9+, llama.cpp (latest), LM Studio 0.2.9+
+## 📊 Performance Notes
+- **Q4_K_M** provides the best balance for most use cases with minimal quality loss
+- **Q5_K_M** offers slightly better quality at the cost of a roughly 16% larger file (5.1 GB vs 4.4 GB)
+- **Q8_0** provides near-original quality, useful for critical planning tasks
+- **F16** is the unquantized 16-bit version, recommended only for research or quality comparison
+## 🙏 Credits
+- **Original Model**: [AgentFlow Team](https://huggingface.co/AgentFlow)
+- **Base Model**: [Qwen Team](https://huggingface.co/Qwen)
+- **Quantization**: kh0pp
+- **Tools**: llama.cpp by @ggerganov and contributors
+## 📄 License
+MIT License - Same as the original AgentFlow Planner model.
+## 🔗 Links
+- **Original Model**: https://huggingface.co/AgentFlow/agentflow-planner-7b
+- **AgentFlow Research**: https://github.com/lupantech/AgentFlow
+- **llama.cpp**: https://github.com/ggerganov/llama.cpp
+- **Ollama**: https://ollama.ai
+---
+*First GGUF quantization of AgentFlow Planner 7B. If you find this useful, consider starring the original model repository!*