---
license: mit
base_model: AgentFlow/agentflow-planner-7b
tags:
- quantized
- GGUF
- planning
- agent
- reasoning
- qwen2.5
model_type: qwen2
quantized_by: kh0pp
---

# AgentFlow Planner 7B - GGUF

Quantized GGUF versions of [AgentFlow/agentflow-planner-7b](https://huggingface.co/AgentFlow/agentflow-planner-7b) for efficient local inference.

## 📋 Model Details

**AgentFlow Planner 7B** is a specialized language model fine-tuned from Qwen2.5-7B-Instruct, designed specifically for **planning and agentic reasoning tasks**. This model excels at breaking down complex tasks into manageable steps, analyzing dependencies, and creating effective execution plans.

### Base Model Information
- **Base**: Qwen2.5-7B-Instruct
- **Parameters**: 7.62 billion
- **Context Length**: 32,768 tokens
- **License**: MIT
- **Specialization**: Planning, multi-step reasoning, tool integration
- **Original Repository**: [AgentFlow/agentflow-planner-7b](https://huggingface.co/AgentFlow/agentflow-planner-7b)
- **Research**: [AgentFlow GitHub](https://github.com/lupantech/AgentFlow)
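
A quick way to sanity-check whether a long planning prompt fits in the 32,768-token window is the rough heuristic of about 4 characters per token for English prose. This ratio is an assumption, not the Qwen2.5 tokenizer's exact behavior; use the model's tokenizer when you need precise counts:

```python
# Rough context-fit check. The ~4 chars/token ratio is a heuristic for
# English text, not an exact tokenizer count.
CONTEXT_TOKENS = 32_768
CHARS_PER_TOKEN = 4  # assumption; varies by language and content

def fits_in_context(text: str, reserve_for_output: int = 1024) -> bool:
    """Estimate whether `text` plus a reserved output budget fits the window."""
    est_tokens = len(text) / CHARS_PER_TOKEN
    return est_tokens + reserve_for_output <= CONTEXT_TOKENS

# ~32k tokens of input budget is roughly 130k characters of prose.
print(fits_in_context("plan the migration " * 1000))  # → True
```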

### About AgentFlow
AgentFlow is an advanced AI framework with four specialized modules:
- **Planner** (this model): Strategic task decomposition and planning
- **Executor**: Action execution
- **Verifier**: Result validation
- **Generator**: Output synthesis

The Planner model has been shown to outperform larger models like GPT-4o on certain planning benchmarks.

## 📦 Available Quantizations

All quantizations were created with llama.cpp's quantization tools.

| Filename | Quant | Size | Use Case | Memory Required |
|----------|-------|------|----------|-----------------|
| `agentflow-planner-7b-f16.gguf` | F16 | 15.0 GB | Full precision, best quality | ~17 GB |
| `agentflow-planner-7b-Q8_0.gguf` | Q8_0 | 7.6 GB | Near-full quality, faster | ~10 GB |
| `agentflow-planner-7b-Q5_K_M.gguf` | Q5_K_M | 5.1 GB | High quality | ~7 GB |
| `agentflow-planner-7b-Q4_K_M.gguf` | Q4_K_M | 4.4 GB | ⭐ **Recommended** - Best balance | ~6 GB |

### Quantization Recommendations

- **Q4_K_M**: Best for most users - excellent quality/speed/size balance
- **Q5_K_M**: When you need slightly higher quality and have more VRAM
- **Q8_0**: Maximum quality while still being smaller than F16
- **F16**: Research, or when you need the absolute best quality

## 🚀 Usage

### Ollama (Recommended)

**Quick Start:**
```bash
# Download the Q4_K_M model
huggingface-cli download kh0pp/agentflow-planner-7b-GGUF agentflow-planner-7b-Q4_K_M.gguf --local-dir .

# Create Modelfile
cat > Modelfile << 'EOF'
FROM ./agentflow-planner-7b-Q4_K_M.gguf

TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
{{ .Response }}<|im_end|>
"""

PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER top_k 40
PARAMETER num_ctx 32768
PARAMETER repeat_penalty 1.1

SYSTEM """You are an advanced AI agent specialized in planning and reasoning. You excel at breaking down complex tasks into manageable steps, analyzing dependencies, and creating effective execution plans."""
EOF

# Create and run
ollama create agentflow-planner:7b -f Modelfile
ollama run agentflow-planner:7b
```

### llama.cpp

```bash
# Download the model
huggingface-cli download kh0pp/agentflow-planner-7b-GGUF agentflow-planner-7b-Q4_K_M.gguf --local-dir .

# Run with llama.cpp
./llama-cli -m agentflow-planner-7b-Q4_K_M.gguf \
  -p "Create a detailed plan for building a web application" \
  -n 512 -c 4096
```

### LM Studio

1. Download any GGUF file from this repository
2. Load it in LM Studio
3. Use the Qwen2 chat template
4. Recommended settings:
   - Temperature: 0.7
   - Top P: 0.9
   - Context: 32768
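
LM Studio (and most GGUF runtimes) applies the chat template automatically from the model metadata, but if a client needs to build the prompt by hand, the Qwen2 ChatML format (the same one used in the Ollama Modelfile above) can be assembled like this. A minimal sketch, not an official API:

```python
# Build a Qwen2-style ChatML prompt by hand. Most runtimes do this for you
# from the GGUF's embedded chat template; this is only for raw completion use.
def build_chatml(system: str, user: str) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"  # left open: the model writes the reply here
    )

prompt = build_chatml(
    "You are an advanced AI agent specialized in planning and reasoning.",
    "Outline a three-phase rollout plan for a new API.",
)
print(prompt)
```

Pass the resulting string to a raw-completion endpoint and stop generation at `<|im_end|>`.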

### Python (llama-cpp-python)

```python
from llama_cpp import Llama

llm = Llama(
    model_path="agentflow-planner-7b-Q4_K_M.gguf",
    n_ctx=32768,
    n_gpu_layers=-1,  # Offload all layers to the GPU; set to 0 for CPU-only
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an advanced AI agent specialized in planning and reasoning."},
        {"role": "user", "content": "Create a detailed project plan for developing a mobile app"},
    ],
    temperature=0.7,
    max_tokens=512,
)

print(response["choices"][0]["message"]["content"])
```

## 💡 Example Use Cases

This model excels at:

- **Project Planning**: Breaking down complex projects into phases and tasks
- **Code Architecture**: Designing system architectures and implementation strategies
- **Research Planning**: Creating research methodologies and experiment designs
- **Workflow Optimization**: Analyzing and improving processes
- **Multi-Step Problem Solving**: Decomposing complex problems into solvable steps
- **Tool Integration**: Planning how to use multiple tools to accomplish goals

## 🔧 Technical Details

- **Quantization Method**: llama.cpp Q4_K_M, Q5_K_M, Q8_0, F16
- **Original Format**: SafeTensors (7 files, ~30 GB)
- **Conversion Tool**: llama.cpp convert_hf_to_gguf.py
- **Tested With**: Ollama 0.1.9+, llama.cpp (latest), LM Studio 0.2.9+

## 📊 Performance Notes

- **Q4_K_M** provides the best balance for most use cases with minimal quality loss
- **Q5_K_M** offers slightly better quality at the cost of a ~15% larger file size
- **Q8_0** provides near-original quality, useful for critical planning tasks
- **F16** is the full-precision version, recommended only for research or quality comparison

## 🙏 Credits

- **Original Model**: [AgentFlow Team](https://huggingface.co/AgentFlow)
- **Base Model**: [Qwen Team](https://huggingface.co/Qwen)
- **Quantization**: kh0pp
- **Tools**: llama.cpp by @ggerganov and contributors

## 📄 License

MIT License - same as the original AgentFlow Planner model.

## 🔗 Links

- **Original Model**: https://huggingface.co/AgentFlow/agentflow-planner-7b
- **AgentFlow Research**: https://github.com/lupantech/AgentFlow
- **llama.cpp**: https://github.com/ggerganov/llama.cpp
- **Ollama**: https://ollama.ai

---

*First GGUF quantization of AgentFlow Planner 7B. If you find this useful, consider starring the original model repository!*