Nikita Miroshnichenko committed
Commit 7f7d760 · unverified · 1 Parent(s): af3a044

Create README.md

Files changed (1)
  1. README.md +147 -0
README.md ADDED
@@ -0,0 +1,147 @@
+ <!--
+ ___ _ _ _
+ / _ \ | | | | | |
+ / /_\ \__ _ _ __ __ _ __| | ___| |__ ___ __| | ___ ___
+ | _ / _` | '__/ _` |/ _` |/ _ \ '_ \ / _ \ / _` |/ _ \/ __|
+ | | | | (_| | | | (_| | (_| | __/ | | | (_) | (_| | __/\__ \
+ \_| |_/\__,_|_| \__,_|\__,_|\___|_| |_|\___/ \__,_|\___||___/
+
+ Welcome to **Ankelodon**, a modular multi‑agent framework for complex question answering and data analysis.
+ This project leverages [LangGraph](https://python.langgraph.org/) and [LangChain](https://python.langchain.com/) to orchestrate a suite of tools that can plan, execute and validate tasks on your behalf.
+
+ -->
+
+ # 🧬 Ankelodon Multi‑Agent System
+
+ **Ankelodon** is a proof‑of‑concept multi‑tool agent inspired by the GAIA evaluation framework.
+ It combines planning, execution and critique to solve open‑ended queries that might involve search, file analysis, mathematics, coding or image understanding.
+ By breaking down tasks into manageable steps and selecting the right tool for each job, Ankelodon aims to deliver accurate answers with verifiable evidence.
+
+ ![project logo](docs/images/ankelodon_banner.png)
+
+ > *Note: The banner above is a placeholder. You can replace it with your own image placed at `docs/images/ankelodon_banner.png`.*
+
+ ## 🌟 Features
+
+ ### 🧠 Complexity assessment & routing
+
+ Before doing any heavy lifting, Ankelodon evaluates the incoming query to determine whether it requires planning or can be answered directly.
+ Simple questions (e.g. definitions, single mathematical operations) are answered via a lightweight executor.
+ Moderate and complex queries trigger the planner and agent pipeline, so that they are decomposed into steps and matched with suitable tools.
+
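The routing decision described above can be sketched in a few lines. This is only an illustration: the node names `simple_executor` and `planner` are assumptions, and in Ankelodon the complexity label presumably comes from an LLM call rather than being passed in directly.

```python
from typing import Literal

# The three labels come from the README; the node names below are hypothetical.
Complexity = Literal["simple", "moderate", "complex"]


def route(complexity: Complexity) -> str:
    """Return the name of the next node for a given complexity label."""
    # Simple queries skip planning; everything else goes through the planner.
    return "simple_executor" if complexity == "simple" else "planner"


print(route("simple"))
print(route("complex"))
```

Keeping the router a pure function of the label makes the branch easy to test without calling a model.
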
+ ### 🧭 Structured planning
+
+ For non‑trivial tasks, a **planner** LLM generates a structured plan consisting of a series of steps.
+ Each step has an ID, goal, selected tool, expected result and fallback strategy.
+ The plan is stored as a Pydantic model (`PlannerPlan`), so its fields are typed and validated.
+
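To make the shape of a plan concrete, here is a minimal sketch that uses standard-library dataclasses as a stand-in for the Pydantic model. The class names `PlanStep` and `Plan` and the exact field names are assumptions based on the description above, not the actual `PlannerPlan` definition.

```python
from dataclasses import dataclass, field


@dataclass
class PlanStep:
    """One step of a plan; field names are illustrative assumptions."""
    id: int
    goal: str
    tool: str
    expected_result: str
    fallback: str


@dataclass
class Plan:
    """A stand-in for PlannerPlan: the query plus an ordered list of steps."""
    query: str
    steps: list[PlanStep] = field(default_factory=list)


plan = Plan(
    query="What is 12 * 12?",
    steps=[
        PlanStep(
            id=1,
            goal="Multiply the two numbers",
            tool="multiply",
            expected_result="A single number",
            fallback="Compute the product with safe_code_run",
        )
    ],
)
print(plan.steps[0].tool)
```

Unlike Pydantic, dataclasses do not validate field types at runtime, so this sketch only shows the structure, not the validation behaviour.
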
+ ### 🤖 Agent execution
+
+ The **agent** node follows the plan step‑by‑step.
+ For each step it first produces reasoning, then invokes the suggested tool with the appropriate inputs.
+ Tool outputs are captured and fed back into subsequent reasoning.
+ The agent continues until all steps are complete or an error requires replanning.
+
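A rough sketch of that loop, with the LLM reasoning step left out: steps run in order, each output is recorded, and the first failure stops execution and signals that replanning is needed. The step dictionary layout (`id`, `tool`, `inputs`) and the status strings are assumptions made for this example.

```python
from typing import Callable


def run_plan(steps: list[dict], tools: dict[str, Callable[..., object]]) -> dict:
    """Run steps in order; stop and request replanning on the first failure."""
    results: dict[int, object] = {}
    for step in steps:
        tool = tools.get(step["tool"])
        try:
            if tool is None:
                raise KeyError(f"unknown tool: {step['tool']}")
            # Record the output so later steps (or the report) can use it.
            results[step["id"]] = tool(**step["inputs"])
        except Exception as exc:
            return {
                "status": "needs_replan",
                "failed_step": step["id"],
                "error": str(exc),
                "results": results,
            }
    return {"status": "done", "results": results}


tools = {"add": lambda a, b: a + b}
ok = run_plan([{"id": 1, "tool": "add", "inputs": {"a": 2, "b": 3}}], tools)
bad = run_plan([{"id": 1, "tool": "missing", "inputs": {}}], tools)
print(ok["status"], bad["status"])
```
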
+ ### 🧰 Rich toolset
+
+ Ankelodon exposes a curated set of tools bound to the execution LLM:
+
+ | Tool | Purpose |
+ |---|---|
+ | `download_file_from_url` | Download files from the web by URL |
+ | `web_search` | Perform internet search via Tavily API |
+ | `arxiv_search` | Find relevant academic papers on arXiv |
+ | `wiki_search` | Fetch Wikipedia articles and summaries |
+ | `add`, `subtract`, `multiply`, `divide`, `power` | Basic arithmetic operations |
+ | `analyze_excel_file`, `analyze_csv_file` | Parse spreadsheets and compute statistics |
+ | `analyze_docx_file`, `analyze_pdf_file`, `analyze_txt_file` | Extract and summarise document content |
+ | `vision_qa_gemma` | Answer questions about images using a vision model |
+ | `safe_code_run` | Execute Python code securely in an isolated environment |
+
+ These tools are loaded into a `ToolNode` and passed to the agent for use during execution.
+
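As an illustration of how the arithmetic tools in the table might look as plain functions, here is a small name-to-function registry. The tool names come from the table; the two-argument signatures, the zero-division check and the `call_tool` helper are assumptions, and the real tools are presumably wrapped as LangChain tools rather than stored in a dictionary.

```python
import operator
from typing import Callable


def divide(a: float, b: float) -> float:
    """Divide a by b, rejecting division by zero with a clear message."""
    if b == 0:
        raise ValueError("Cannot divide by zero.")
    return a / b


# Hypothetical registry mapping tool names to implementations.
ARITHMETIC_TOOLS: dict[str, Callable[[float, float], float]] = {
    "add": operator.add,
    "subtract": operator.sub,
    "multiply": operator.mul,
    "divide": divide,
    "power": operator.pow,
}


def call_tool(name: str, a: float, b: float) -> float:
    """Look up a tool by name and apply it to two numbers."""
    if name not in ARITHMETIC_TOOLS:
        raise KeyError(f"Unknown tool: {name}")
    return ARITHMETIC_TOOLS[name](a, b)


print(call_tool("power", 2, 10))
```
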
+ ### 📝 Comprehensive reporting & critique
+
+ After the agent finishes, a deterministic LLM generates a structured execution report.
+ This report summarises the query, steps taken, key findings, sources used, and the final answer.
+ A separate **critic** LLM evaluates the report for completeness, accuracy, methodology and evidence, scoring it out of 10 and suggesting improvements if necessary.
+ The system may then replan and re‑execute until the answer meets quality thresholds.
+
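The accept-or-replan decision could look roughly like the sketch below. The four criteria are taken from the description above; averaging them into one score, the threshold of 7.0 and the iteration cap are assumptions made for this example, not the project's actual rule.

```python
from dataclasses import dataclass


@dataclass
class Critique:
    """Per-criterion scores out of 10, as assigned by the critic."""
    completeness: int
    accuracy: int
    methodology: int
    evidence: int

    @property
    def overall(self) -> float:
        # Assumption: the overall score is a plain average of the four criteria.
        return (self.completeness + self.accuracy + self.methodology + self.evidence) / 4


def should_replan(critique: Critique, iteration_count: int,
                  max_iterations: int, threshold: float = 7.0) -> bool:
    """Replan only if the score is below threshold and iterations remain."""
    if iteration_count >= max_iterations:
        return False
    return critique.overall < threshold


print(should_replan(Critique(5, 6, 4, 5), iteration_count=0, max_iterations=3))
```

Capping the number of iterations keeps a persistently low score from causing an endless replan loop.
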
+ ## 🏗 Architecture
+
+ Ankelodon is built as a directed graph of nodes. Because the critic can route back to replanning, the graph contains a loop rather than being strictly acyclic. The high‑level flow is:
+
+ 1. **INPUT** – Receive the user query and optional files.
+ 2. **COMPLEXITY_ASSESSOR** – Classify the query as simple, moderate or complex and decide whether to plan.
+ 3. **PLANNING** – Generate a multi‑step plan when needed, using examples and strict rules about tool usage and numerical computation.
+ 4. **AGENT** – Iterate through the plan: reason about each step, call a tool, capture results and update state.
+ 5. **TOOLS** – Execute selected tools via a unified `ToolNode`.
+ 6. **FINALIZER** – Consolidate the execution into a report and extract a formatted final answer.
+ 7. **CRITIC** – Score the report and decide whether to accept or trigger the **REPLANNER**.
+
+ The graph is compiled using LangGraph’s `StateGraph` API and is flexible enough to be extended with new nodes or tools.
+
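To show the order in which nodes are visited for a planned query, the following plain-Python trace lists the node names, including the loop back through the replanner. It does not use LangGraph. The single AGENT/TOOLS visit per attempt is a simplification, and the threshold value is an assumption.

```python
def trace_flow(critic_scores: list[int], threshold: int = 7) -> list[str]:
    """Return the node names visited, given one critic score per attempt."""
    path = ["INPUT", "COMPLEXITY_ASSESSOR", "PLANNING"]
    for attempt, score in enumerate(critic_scores):
        # One pass through execution, reporting and critique.
        path += ["AGENT", "TOOLS", "FINALIZER", "CRITIC"]
        if score >= threshold:
            break  # report accepted
        if attempt < len(critic_scores) - 1:
            path.append("REPLANNER")  # rejected, another attempt follows
    return path


print(" -> ".join(trace_flow([5, 8])))
```
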
+ ## 🚀 Getting started
+
+ ### Prerequisites
+
+ This project targets **Python 3.10+**. You’ll need API keys or credentials for any external services (e.g. OpenAI, Tavily, Gemini) used by tools.
+ With a virtual environment activated, install the core dependencies:
+
+ ```bash
+ # quote the version specifier so the shell does not expand the asterisk
+ pip install "langchain==0.1.*" langgraph openai google-generativeai
+ # plus any other packages referenced in tools (pandas, numpy, pillow, tldextract, etc.)
+ ```
+
+ ### Running a simple query
+
+ The entry point is the `build_workflow` function in `src/agent.py`. It returns a compiled graph that you invoke with a dictionary representing the agent state.
+ A minimal example:
+
+ ```python
+ from src.agent import build_workflow
+
+ # Initialize the graph
+ system = build_workflow()
+
+ # Build the initial state
+ state = {
+     "query": "What is the square root of 144?",
+     "messages": [],
+     "files": [],
+     "iteration_count": 0,
+     "max_iterations": 3,
+ }
+
+ # Invoke the system and get the result
+ result = system.invoke(state)
+ print(result.get("final_answer"))  # should output: FINAL ANSWER: 12
+ ```
+
+ For more complex tasks involving file uploads or web searches, provide file paths in the `files` list and ensure appropriate API keys are set in the environment.
+
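A small helper along these lines can build the state for a query with files and report which API keys are missing before you invoke the graph. The state keys mirror the example above. The helper names, the file path and the environment variable names `OPENAI_API_KEY` and `TAVILY_API_KEY` are assumptions, so check the tools' source for the names they actually read.

```python
import os
from typing import Mapping, Optional


def make_initial_state(query: str, files: Optional[list[str]] = None,
                       max_iterations: int = 3) -> dict:
    """Build a state dictionary with the same keys as the minimal example."""
    return {
        "query": query,
        "messages": [],
        "files": list(files or []),
        "iteration_count": 0,
        "max_iterations": max_iterations,
    }


def missing_keys(required: list[str],
                 env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Hypothetical file path and variable names, for illustration only.
state = make_initial_state("Summarise the attached sales data.", ["data/sales.csv"])
print(missing_keys(["OPENAI_API_KEY", "TAVILY_API_KEY"]))
```
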
+ ### Notebooks & examples
+
+ There are example notebooks under `src/` and `test_folder/` demonstrating how to test the agent with sample queries and data.
+ Feel free to explore and adapt them to your own scenarios.
+
+ ## 🛣 Roadmap & GAIA adaptation
+
+ - Integrate unit conversion, date arithmetic and table operations to handle GAIA evaluation tasks out‑of‑the‑box.
+ - Add question‑clarification and error‑recovery loops to minimise unnecessary replanning.
+ - Streamline the tool list by removing unused tools and grouping related operations.
+ - Improve caching of external calls (e.g. web search, downloads) to speed up repeated queries.
+ - Expand the test suite and add continuous integration.
+
+ ## 🤝 Contributing
+
+ Contributions are welcome! If you find a bug or have an idea for improvement, feel free to open an issue or a pull request.
+ When adding new tools or nodes, please ensure they adhere to the structured planning and execution patterns shown here, and update the tests accordingly.
+
+ ## 📄 License
+
+ This project is released under the MIT License. See `LICENSE` for details.
+
+ ---
+
+ *Ankelodon is a work in progress. Your feedback and use‑cases will help shape its future. Happy hacking!* 🦾