Usage Guide
Quick Start (5 minutes)
Step 1: Install Dependencies
pip install -r requirements.txt
Step 2: Configure API Keys
Copy .env.example to .env and fill in your API keys:
cp .env.example .env
# Edit .env file with your keys
Minimum required:
- HUGGINGFACEHUB_API_TOKEN (for the LLM)
- TAVILY_API_KEY (for web search)
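A minimal .env might look like this. The values shown are placeholders, not real keys; the GROQ_API_KEY and GOOGLE_API_KEY names are the conventional ones expected by LangChain provider integrations and are only needed if you switch to those providers later.

```
HUGGINGFACEHUB_API_TOKEN=hf_xxxxxxxxxxxxxxxx
TAVILY_API_KEY=tvly-xxxxxxxxxxxxxxxx

# Optional, only for the corresponding provider:
GROQ_API_KEY=gsk_xxxxxxxxxxxxxxxx
GOOGLE_API_KEY=AIzaxxxxxxxxxxxxxxxx
```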
Step 3: Setup Vector Database
python setup_chromadb.py
This will:
- Download embedding model (~90MB)
- Load metadata.jsonl
- Create local ChromaDB database
Step 4: Run the Agent
Option A: Gradio UI (Recommended)
python app.py
Then:
- Log in with your HuggingFace account
- Click "Run Evaluation & Submit All Answers"
- Wait for results
Option B: Command Line
python agent.py
Edit the question in agent.py line 208.
Understanding the Output
Agent Messages
The agent produces several message types:
- SystemMessage: Instructions and guidelines
- HumanMessage: Your question + similar example (from RAG)
- AIMessage: Agent's reasoning and tool calls
- ToolMessage: Tool execution results
- Final AIMessage: The answer
Example:
SystemMessage: You are Alfred, an intelligent research assistant...
HumanMessage: What is quantum computing?
HumanMessage: Here is a similar question... (from vector DB)
AIMessage: I will use deep_research tool
tool_calls: [{"name": "deep_research", "args": {"query": "..."}}]
ToolMessage: DEEP RESEARCH REPORT: ...
AIMessage: FINAL ANSWER: Quantum computing is...
Deep Research Report Structure
DEEP RESEARCH REPORT: [Query]
=====================================
OVERVIEW
- Total sources: X
- Wikipedia: Y articles
- Web: Z pages
- Academic: W papers
WIKIPEDIA FINDINGS
[Source 1] ...
[Source 2] ...
WEB FINDINGS
[Source 1] ...
[Source 2] ...
ACADEMIC FINDINGS
[Source 1] ...
[Source 2] ...
ALL SOURCES
[1] https://...
[2] https://...
Customization
Changing LLM Provider
Edit agent.py line 217:
# Option 1: HuggingFace (Free, slower)
graph = build_graph(provider="huggingface")
# Option 2: Groq (Fast, free tier available)
graph = build_graph(provider="groq")
# Option 3: Google Gemini (Balanced, requires payment)
graph = build_graph(provider="google")
Adjusting Deep Research Behavior
Edit deep_research_tool.py:
# Line 54: Wikipedia results
WikipediaLoader(query=query, load_max_docs=2) # Change to 1-5
# Line 68: Web results
TavilySearchResults(max_results=10) # Change to 3-15
# Line 85: Academic results
ArxivLoader(query=query, load_max_docs=5) # Change to 1-10
# Line 58, 75, 89: Content truncation
"content": doc.page_content[:2000] # Change to 500-3000
Modifying System Prompt
Edit system_prompt.txt to:
- Change tool selection strategy
- Adjust reasoning guidelines
- Modify output format
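The agent presumably reads system_prompt.txt at startup; a minimal loader sketch (the function name and fallback text are hypothetical, not from the repo) shows how a safe default keeps the agent usable if the file is missing:

```python
from pathlib import Path

def load_system_prompt(path: str = "system_prompt.txt") -> str:
    """Return the system prompt text, or a safe default if the file is absent."""
    p = Path(path)
    if p.exists():
        return p.read_text(encoding="utf-8").strip()
    # Hypothetical fallback, mirroring the prompt shown earlier in this guide
    return "You are Alfred, an intelligent research assistant."
```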
Testing Different Question Types
Mathematical Questions
question = "What is 15 multiplied by 23?"
# Expected: Uses multiply tool → Direct answer
Simple Factual Questions
question = "Who invented the telephone?"
# Expected: Uses wiki_search → Quick answer
Complex Conceptual Questions
question = "Explain quantum entanglement and its applications"
# Expected: Uses deep_research β Comprehensive answer
Recent Events
question = "What are the latest AI developments in 2025?"
# Expected: Uses web_search or deep_research
Troubleshooting
Issue: "No module named 'sentence_transformers'"
pip install sentence-transformers
Issue: "TAVILY_API_KEY not found"
Make sure .env file exists and contains:
TAVILY_API_KEY=tvly-xxxxx
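To diagnose missing keys quickly, a small check can report which required keys appear neither in the environment nor in the .env file. This helper is not part of the repo; the key names match the ones listed in Step 2:

```python
import os
from pathlib import Path

REQUIRED_KEYS = ["HUGGINGFACEHUB_API_TOKEN", "TAVILY_API_KEY"]

def missing_keys(env_path: str = ".env", required=REQUIRED_KEYS) -> list:
    """Return required keys found neither in os.environ nor in the .env file."""
    found = set()
    env_file = Path(env_path)
    if env_file.exists():
        for line in env_file.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip blank lines and comments; parse KEY=VALUE pairs
            if line and not line.startswith("#") and "=" in line:
                found.add(line.split("=", 1)[0].strip())
    return [k for k in required if k not in os.environ and k not in found]
```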
Issue: ChromaDB not working
Delete and recreate:
rm -rf chroma_db/
python setup_chromadb.py
Issue: Agent not using deep_research
Check system_prompt.txt - make sure it mentions when to use deep_research.
Or be explicit in your question:
question = "Use deep research to analyze quantum computing"
Performance Optimization
Speed vs Quality Trade-offs
Faster (for testing):
# deep_research_tool.py
WikipediaLoader(load_max_docs=1)
TavilySearchResults(max_results=3)
ArxivLoader(load_max_docs=1)
"content": doc.page_content[:500]
Balanced (recommended):
WikipediaLoader(load_max_docs=2)
TavilySearchResults(max_results=10)
ArxivLoader(load_max_docs=5)
"content": doc.page_content[:2000]
Comprehensive (slower but thorough):
WikipediaLoader(load_max_docs=5)
TavilySearchResults(max_results=15)
ArxivLoader(load_max_docs=10)
"content": doc.page_content[:5000]
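The repo hard-codes these values directly in deep_research_tool.py, so the dictionary below is only a hypothetical way to keep the three profiles side by side for comparison, not an interface the code exposes:

```python
# Hypothetical preset table mirroring the three profiles above
PRESETS = {
    "fast":          {"wiki_docs": 1, "web_results": 3,  "arxiv_docs": 1,  "max_chars": 500},
    "balanced":      {"wiki_docs": 2, "web_results": 10, "arxiv_docs": 5,  "max_chars": 2000},
    "comprehensive": {"wiki_docs": 5, "web_results": 15, "arxiv_docs": 10, "max_chars": 5000},
}
```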
Advanced Usage
Adding Custom Tools
- Define the tool in agent.py:
@tool
def my_custom_tool(query: str) -> str:
    """Description of what this tool does."""
    # Your implementation
    return result
- Add to tools list:
tools = [
    multiply, add, subtract,
    wiki_search, web_search,
    deep_research,
    my_custom_tool,  # Add here
]
- Update system_prompt.txt to mention when to use it.
Batch Processing
from agent import build_graph
from langchain_core.messages import HumanMessage
graph = build_graph(provider="huggingface")
questions = [
    "Question 1",
    "Question 2",
    "Question 3",
]
for q in questions:
    result = graph.invoke({"messages": [HumanMessage(content=q)]})
    answer = result["messages"][-1].content
    print(f"Q: {q}\nA: {answer}\n")
Logging and Debugging
Add logging to track agent behavior:
# In agent.py, modify assistant function:
def assistant(state: MessagesState):
    """Assistant node"""
    print(f"\n{'='*60}")
    print("Assistant Input:")
    for msg in state["messages"]:
        print(f"  - {type(msg).__name__}: {msg.content[:100]}...")
    result = llm_with_tools.invoke(state["messages"])
    print("\nAssistant Output:")
    if hasattr(result, "tool_calls") and result.tool_calls:
        print(f"  Tool calls: {[tc['name'] for tc in result.tool_calls]}")
    print(f"{'='*60}\n")
    return {"messages": [result]}
FAQ
Q: How do I know which tool was used?
A: Check the AIMessage for tool_calls field, or add logging as shown above.
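A small helper makes this check reusable. It is not part of the repo; it works on any message object that may carry a tool_calls attribute (as LangChain AIMessages do) and skips everything else:

```python
def tools_used(messages) -> list:
    """List the names of every tool call in a message transcript, in order."""
    names = []
    for msg in messages:
        # AIMessages expose tool_calls as a list of dicts with a "name" key;
        # other message types simply lack the attribute and are skipped.
        for tc in getattr(msg, "tool_calls", None) or []:
            names.append(tc["name"])
    return names
```

For example, after `result = graph.invoke(...)`, calling `tools_used(result["messages"])` returns something like `["deep_research"]`.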
Q: Can I use without Tavily API?
A: Yes, but web_search and deep_research will partially fail. Consider removing them from the tools list.
Q: How long does setup take?
A: ~5 minutes (mostly downloading the embedding model).
Q: Can I run this offline?
A: No, it requires API calls to LLM and search services.
Q: How much does it cost?
A: Using HuggingFace Inference API is free (with rate limits). Tavily has a free tier (1000 queries/month).
Next Steps
- Test with different question types
- Optimize performance for your use case
- Customize system prompt
- Add domain-specific tools
- Integrate with your application
For more details, see the full documentation in the main repository.