Punit1 committed on
Commit f50852a · 1 Parent(s): 52e9d16

Add Dockerfile and switch to Docker SDK

Files changed (4):
  1. DEPLOY_TO_HF.md +0 -53
  2. Dockerfile +28 -0
  3. PROJECT_README.md +0 -187
  4. walkthrough.md.resolved +0 -256
DEPLOY_TO_HF.md DELETED
@@ -1,53 +0,0 @@
- # Quick Deployment to Hugging Face Spaces
-
- ## TL;DR - Fast Deployment Steps
-
- ### 1. Get API Keys
- - Groq: https://console.groq.com/
- - Tavily: https://tavily.com/
-
- ### 2. Create HF Space
- 1. Go to: https://huggingface.co/new-space
- 2. Choose: **Streamlit** SDK
- 3. Name it: `research-agent`
- 4. Create Space
-
- ### 3. Upload Files
-
- **Using Web Interface:**
- - Upload: `main.py`, `requirements.txt`, entire `src/` folder, `.streamlit/` folder
- - **Rename** `HF_README.md` to `README.md` before uploading
-
- **Using Git:**
- ```bash
- git init
- git add .
- git commit -m "Deploy to HF Spaces"
- git remote add hf https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
- git push hf main
- ```
-
- ### 4. Add Secrets
- In your Space → Settings → Repository secrets:
- - `GROQ_API_KEY` = your Groq API key
- - `TAVILY_API_KEY` = your Tavily API key
-
- ### 5. Done!
- Your app will be live at: `https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME`
-
- ---
-
- ## Files Checklist
-
- ✅ All files are ready in your project:
-
- - [x] `main.py` - Main app
- - [x] `requirements.txt` - Dependencies
- - [x] `src/` - Source code
- - [x] `.streamlit/config.toml` - HF configuration
- - [x] `HF_README.md` - Space README (rename to README.md)
- - [x] `.gitignore` - Ignore unnecessary files
-
- **Your project is deployment-ready!** 🚀
-
- For detailed instructions, see: `hf_deployment_guide.md`
Dockerfile ADDED
@@ -0,0 +1,28 @@
+ FROM python:3.10-slim
+
+ WORKDIR /app
+
+ # Install system dependencies
+ RUN apt-get update && apt-get install -y \
+     build-essential \
+     curl \
+     software-properties-common \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Copy requirements first for better caching
+ COPY requirements.txt .
+
+ # Install Python dependencies
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Copy application code
+ COPY . .
+
+ # Expose port 7860 (HF Spaces default)
+ EXPOSE 7860
+
+ # Health check
+ HEALTHCHECK CMD curl --fail http://localhost:7860/_stcore/health
+
+ # Run the application
+ CMD ["streamlit", "run", "main.py", "--server.port=7860", "--server.address=0.0.0.0"]
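With the Docker SDK, HF Spaces builds and runs this image automatically. To sanity-check it locally before pushing, a typical build-and-run sequence looks like this (the `research-agent` tag and the key values are placeholders, not part of the commit):

```shell
# Build the image from the repository root (where the Dockerfile lives)
docker build -t research-agent .

# Run it, passing the API keys the app reads from the environment
docker run --rm -p 7860:7860 \
  -e GROQ_API_KEY=your_groq_api_key_here \
  -e TAVILY_API_KEY=your_tavily_api_key_here \
  research-agent

# The Streamlit UI should then be reachable at http://localhost:7860
```

On the Space itself, the same keys come from Settings → Repository secrets rather than `-e` flags.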
PROJECT_README.md DELETED
@@ -1,187 +0,0 @@
- # Autonomous Research Agent with LangGraph, Groq, and Streamlit
-
- This repository contains the complete source code for an **autonomous AI research agent**. The agent takes a user-defined topic, performs web searches to gather information, evaluates and summarizes relevant sources, and compiles the findings into a comprehensive report.
-
- The project is built using a modern AI stack, showcasing a stateful, cyclic architecture that enables complex, multi-step reasoning and execution, all presented through an interactive web interface.
-
- ---
-
- ## Core Technologies
-
- - **Orchestration:** `LangGraph` – Builds stateful, multi-actor applications with cycles, enabling complex agentic behaviors.
- - **LLM:** `Groq (Llama 3.3 70B)` – High-speed inference using a Language Processing Unit (LPU) for fast and responsive AI reasoning.
- - **Web Interface:** `Streamlit` – Interactive and user-friendly chat-based web application built entirely in Python.
- - **Search Tool:** `Tavily AI` – AI-optimized search engine to gather accurate and relevant information from the web.
- - **Core Framework:** `LangChain` – Provides foundational components, tools, and integrations.
-
- ---
-
- ## Key Features
-
- - **Stateful, Cyclic Architecture:**
-   Uses LangGraph loops to iteratively search, evaluate, and decide whether to continue researching or compile findings, mimicking a human research process.
-
- - **High-Performance LLM:**
-   Leverages Groq LPU with Llama 3.3 70B for reasoning and content generation at extremely high speeds for a seamless user experience.
-
- - **Fault Tolerance and Persistence:**
-   Saves the agent's state at every step using the `SqliteSaver` checkpointer, allowing long-running tasks to resume from the exact point of failure.
-
- - **Interactive Web UI:**
-   The Streamlit-based chat interface lets users input topics, monitor progress in real time, and receive the final report directly in the app.
-
- - **Deep Observability with LangSmith:**
-   Provides detailed traces of every agent step for debugging and understanding complex behavior (optional).
-
- ---
-
- ## Setup Instructions
-
- ### Prerequisites
-
- You will need two API keys:
-
- 1. **Groq API Key** - Sign up at [console.groq.com](https://console.groq.com/)
- 2. **Tavily API Key** - Sign up at [tavily.com](https://tavily.com/)
-
- ### Installation
-
- 1. **Clone the repository** (or navigate to the project directory)
-
-    ```bash
-    cd "Research Agent with LangGraph"
-    ```
-
- 2. **Create and activate a virtual environment**
-
-    ```powershell
-    # Create virtual environment
-    python -m venv venv
-
-    # Activate it (Windows PowerShell)
-    .\venv\Scripts\Activate.ps1
-    ```
-
- 3. **Install dependencies**
-
-    ```powershell
-    .\venv\Scripts\python.exe -m pip install --upgrade pip setuptools wheel
-    .\venv\Scripts\python.exe -m pip install -r requirements.txt
-    ```
-
- 4. **Configure environment variables**
-
-    Create or edit the `.env` file in the root directory and add your API keys:
-
-    ```env
-    GROQ_API_KEY=your_groq_api_key_here
-    TAVILY_API_KEY=your_tavily_api_key_here
-    ```
-
-    You can use `.env.example` as a template.
-
- ---
-
- ## Running the Application
-
- Run the Streamlit app with:
-
- ```powershell
- .\venv\Scripts\python.exe -m streamlit run main.py
- ```
-
- The app will automatically open in your browser at `http://localhost:8501`.
-
- ---
-
- ## Usage
-
- 1. Open the application in your browser
- 2. Enter a research topic in the chat input (e.g., "Recent advances in AI agents")
- 3. Watch the agent work:
-    - 🔍 Search for relevant articles
-    - 📄 Scrape content from URLs
-    - 🤖 Evaluate relevance using the LLM
-    - 📝 Summarize useful information
-    - 📊 Compile a comprehensive report
- 4. Review the final research report
-
- ---
-
- ## Project Structure
-
- ```
- Research Agent with LangGraph/
- ├── main.py              # Streamlit UI and application entry point
- ├── src/
- │   ├── graph.py         # LangGraph workflow and node definitions
- │   ├── agent_state.py   # Agent state schema
- │   └── tools.py         # Search and scraping tools
- ├── requirements.txt     # Python dependencies
- ├── .env                 # API keys (create this file)
- ├── .env.example         # Template for environment variables
- └── checkpoints.sqlite   # SQLite database for state persistence
- ```
-
- ---
-
- ## Troubleshooting
-
- ### Issue: "streamlit.exe not found" or Import Errors
-
- **Solution:** Recreate the virtual environment from scratch:
-
- ```powershell
- # Delete old venv
- Remove-Item -Recurse -Force venv
-
- # Create fresh venv
- python -m venv venv
-
- # Upgrade pip
- .\venv\Scripts\python.exe -m pip install --upgrade pip setuptools wheel
-
- # Install dependencies
- .\venv\Scripts\python.exe -m pip install -r requirements.txt
- ```
-
- ### Issue: API Key Errors
-
- **Solution:** Ensure your `.env` file contains valid API keys and is in the project root directory.
-
- ---
-
- ## How It Works
-
- The agent uses a **cyclic LangGraph workflow**:
-
- 1. **Search Node** → Searches the web using the Tavily API
- 2. **Scrape & Summarize Node** → Scrapes URLs one by one, evaluates relevance, and summarizes
- 3. **Router** → Decides whether to continue scraping or compile the report
- 4. **Compile Report Node** → Synthesizes all summaries into a final report
-
- Each step's state is saved to SQLite, enabling fault tolerance.
-
- ---
-
- ## Optional: LangSmith Tracing
-
- To enable detailed tracing and debugging, add to your `.env`:
-
- ```env
- LANGCHAIN_TRACING_V2=true
- LANGCHAIN_API_KEY=your_langsmith_api_key
- LANGCHAIN_PROJECT=research-agent
- ```
-
- ---
-
- ## License
-
- MIT License - Feel free to use and modify this project.
-
- ---
-
- ## Contributing
-
- Contributions are welcome! Feel free to open issues or submit pull requests.
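The search → scrape/summarize → route → compile cycle that this deleted README describes can be sketched without the LangGraph dependency as a plain-Python loop over the same state fields. The node functions and the `urls`/`summaries` field names below are illustrative stand-ins for the real Tavily/Groq-backed nodes, not the project's actual code:

```python
"""Dependency-free sketch of the agent's cyclic workflow."""

def run_agent(topic, search, scrape_and_summarize, compile_report):
    # The state mirrors the fields a LangGraph TypedDict state might hold.
    state = {"topic": topic, "urls": [], "summaries": []}

    # Search node: populate the URL queue.
    state["urls"] = search(topic)

    # Router loop: keep scraping while URLs remain, then compile.
    while state["urls"]:
        url = state["urls"].pop(0)
        summary = scrape_and_summarize(url)  # returns None for an irrelevant page
        if summary is not None:
            state["summaries"].append(summary)

    return compile_report(state["summaries"])


if __name__ == "__main__":
    # Stub nodes so the control flow can be exercised without any API keys.
    fake_search = lambda topic: ["u1", "u2", "u3"]
    fake_scrape = lambda url: None if url == "u2" else f"summary of {url}"
    fake_compile = lambda summaries: " | ".join(summaries)

    print(run_agent("Benefits of LangGraph", fake_search, fake_scrape, fake_compile))
```

The router here is just the `while` condition; LangGraph expresses the same decision as a conditional edge that loops back to the scrape node or exits to the compile node.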
walkthrough.md.resolved DELETED
@@ -1,256 +0,0 @@
- # Research Agent Project - Analysis & Setup Guide
-
- ## What This Project Does
-
- This is an **Autonomous Research Agent** built with a modern AI stack that:
-
- 1. 🔍 **Searches** the web for articles on a given topic (using Tavily AI)
- 2. 📄 **Scrapes** content from the discovered URLs
- 3. 🤖 **Evaluates** each article for relevance using an LLM
- 4. 📝 **Summarizes** relevant content
- 5. 📊 **Compiles** a comprehensive research report
-
- ### Architecture
-
- The agent uses **LangGraph** to create a stateful, cyclic workflow:
-
- ```mermaid
- graph LR
-     A[User Input Topic] --> B[Search Node]
-     B --> C[Scrape & Summarize Node]
-     C --> D{More URLs?}
-     D -->|Yes| C
-     D -->|No| E[Compile Report Node]
-     E --> F[Final Report]
- ```
-
- ### Technology Stack
-
- - **LangGraph**: Orchestration of the stateful workflow
- - **Groq**: High-speed LLM inference (Llama 3.3 70B)
- - **Streamlit**: Interactive web interface
- - **Tavily AI**: AI-optimized web search
- - **SQLite Checkpointer**: Fault-tolerant state persistence
-
- ---
-
- ## Enhancements Made
-
- Since this project was built 5 months ago, I made the following updates:
-
- ### 1. Updated LLM Model
- **[src/graph.py](file:///c:/Users/punit/Desktop/project/GenAI/Research%20Agent%20with%20LangGraph/src/graph.py#L14-L18)**
-
- Changed from `openai/gpt-oss-120b` (outdated/unavailable) to `llama-3.3-70b-versatile`:
-
- ```diff
- llm = ChatGroq(
- -    model="openai/gpt-oss-120b",
- +    model="llama-3.3-70b-versatile",
-      temperature=0,
-      api_key=os.getenv("GROQ_API_KEY")
- )
- ```
-
- ### 2. Created Environment Configuration Template
- **[.env.example](file:///c:/Users/punit/Desktop/project/GenAI/Research%20Agent%20with%20LangGraph/.env.example)**
-
- Added a template to help configure the required API keys.
-
- ### 3. Fixed Dependency Installation Issues
-
- **Problem:** The initial virtual environment had corrupted dependencies causing import errors.
-
- **Solution:** Recreated the virtual environment from scratch:
- 1. Deleted the old `venv` folder
- 2. Created a fresh virtual environment
- 3. Upgraded pip, setuptools, and wheel
- 4. Installed all dependencies from [requirements.txt](file:///c:/Users/punit/Desktop/project/GenAI/Research%20Agent%20with%20LangGraph/requirements.txt)
-
- ---
-
- ## How to Run
-
- ### Prerequisites
-
- You need two API keys:
- 1. **Groq API Key** - Get from [console.groq.com](https://console.groq.com/)
- 2. **Tavily API Key** - Get from [tavily.com](https://tavily.com/)
-
- ### Step 1: Configure API Keys
-
- Edit your [.env](file:///c:/Users/punit/Desktop/project/GenAI/Research%20Agent%20with%20LangGraph/.env) file and add:
-
- ```env
- GROQ_API_KEY=your_groq_api_key_here
- TAVILY_API_KEY=your_tavily_api_key_here
- ```
-
- > [!IMPORTANT]
- > The [.env](file:///c:/Users/punit/Desktop/project/GenAI/Research%20Agent%20with%20LangGraph/.env) file already exists in the project but needs to be configured with valid API keys.
-
- ### Step 2: Run the Application
-
- Use this command to run the application:
-
- ```powershell
- .\venv\Scripts\python.exe -m streamlit run main.py
- ```
-
- > [!TIP]
- > **Alternative command** (if the above doesn't work):
- > ```powershell
- > python -m streamlit run main.py
- > ```
-
- The app will start and automatically open in your browser at `http://localhost:8501`.
-
- ### Step 3: Use the Agent
-
- 1. Enter a research topic (e.g., "LangGraph features" or "AI agents in 2026")
- 2. Watch the agent:
-    - Search for articles
-    - Evaluate each URL for relevance
-    - Summarize relevant content
-    - Compile the final report
- 3. Review the comprehensive research report
-
- ---
-
- ## Troubleshooting
-
- ### Issue: "streamlit.exe not found"
-
- **Cause:** Dependencies weren't properly installed in the virtual environment.
-
- **Solution:** Recreate the virtual environment:
-
- ```powershell
- # Delete old venv
- Remove-Item -Recurse -Force venv
-
- # Create new venv
- python -m venv venv
-
- # Upgrade pip
- .\venv\Scripts\python.exe -m pip install --upgrade pip setuptools wheel
-
- # Install dependencies
- .\venv\Scripts\python.exe -m pip install -r requirements.txt
- ```
-
- ### Issue: Import errors (pydantic, zstandard, etc.)
-
- **Cause:** Corrupted package installations.
-
- **Solution:** Follow the steps above to recreate the virtual environment completely.
-
- ### Issue: "GROQ_API_KEY not set"
-
- **Cause:** Missing or improperly configured [.env](file:///c:/Users/punit/Desktop/project/GenAI/Research%20Agent%20with%20LangGraph/.env) file.
-
- **Solution:** Ensure your [.env](file:///c:/Users/punit/Desktop/project/GenAI/Research%20Agent%20with%20LangGraph/.env) file contains valid API keys.
-
- ---
-
- ## Project Evaluation
-
- ### ✅ Strengths
-
- - **Well-architected**: Clean separation of concerns (state, graph, tools)
- - **Fault-tolerant**: SQLite checkpointer saves state at every step
- - **Modern stack**: Uses cutting-edge tools (LangGraph, Groq LPU)
- - **User-friendly**: Streamlit provides excellent UX with real-time progress tracking
-
- ### 🔄 Potential Enhancements
-
- While the project is solid, here are some optional improvements:
-
- 1. **Error Handling**
-    - Add retry logic for failed web requests
-    - Handle rate limits from the Groq/Tavily APIs
-
- 2. **Content Quality**
-    - Implement a scoring system for source credibility
-    - Add citation tracking in the final report
-
- 3. **Performance**
-    - Parallelize URL scraping (currently sequential)
-    - Add caching for previously scraped URLs
-
- 4. **Features**
-    - Export reports to PDF/Markdown
-    - Save research history
-    - Allow users to specify the number of sources to research
-
- 5. **Observability**
-    - Enable LangSmith tracing for debugging (already supported, just needs env vars)
-    - Add a metrics dashboard (search count, success rate, etc.)
-
- 6. **Testing**
-    - Add unit tests for individual nodes
-    - Create integration tests for the full workflow
-
- ---
-
- ## Technical Deep Dive
-
- ### Key Files
-
- | File | Purpose |
- |------|---------|
- | [main.py](file:///c:/Users/punit/Desktop/project/GenAI/Research%20Agent%20with%20LangGraph/main.py) | Streamlit UI and session management |
- | [src/graph.py](file:///c:/Users/punit/Desktop/project/GenAI/Research%20Agent%20with%20LangGraph/src/graph.py) | LangGraph workflow definition and node functions |
- | [src/agent_state.py](file:///c:/Users/punit/Desktop/project/GenAI/Research%20Agent%20with%20LangGraph/src/agent_state.py) | TypedDict defining the agent's state schema |
- | [src/tools.py](file:///c:/Users/punit/Desktop/project/GenAI/Research%20Agent%20with%20LangGraph/src/tools.py) | Search and scraping tools |
-
- ### How the Workflow Works
-
- 1. **Search Node** ([graph.py:L23-L30](file:///c:/Users/punit/Desktop/project/GenAI/Research%20Agent%20with%20LangGraph/src/graph.py#L23-L30))
-    - Invokes Tavily search
-    - Extracts URLs from results
-    - Updates state with the URLs list
-
- 2. **Scrape & Summarize Node** ([graph.py:L32-L69](file:///c:/Users/punit/Desktop/project/GenAI/Research%20Agent%20with%20LangGraph/src/graph.py#L32-L69))
-    - Pops one URL from the list
-    - Scrapes content using BeautifulSoup
-    - Asks the LLM to summarize if relevant (or return "IRRELEVANT")
-    - Adds the summary to state if relevant
-
- 3. **Routing Logic** ([graph.py:L91-L98](file:///c:/Users/punit/Desktop/project/GenAI/Research%20Agent%20with%20LangGraph/src/graph.py#L91-L98))
-    - If URLs remain → loop back to scrape another
-    - If no URLs → proceed to compile the report
-
- 4. **Compile Report Node** ([graph.py:L71-L87](file:///c:/Users/punit/Desktop/project/GenAI/Research%20Agent%20with%20LangGraph/src/graph.py#L71-L87))
-    - Takes all summaries
-    - Synthesizes them into a coherent report
-    - Returns the final report to the user
-
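The fault tolerance claimed above rests on checkpointing the state after every node. A minimal stdlib-only illustration of that idea follows; LangGraph's `SqliteSaver` handles this internally, and the `checkpoints` table schema and `thread_id` key here are illustrative, not the library's actual layout:

```python
import json
import sqlite3

def _ensure_table(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints "
        "(thread_id TEXT, step INTEGER, state TEXT)"
    )

def save_checkpoint(conn, thread_id, step, state):
    """Persist the full agent state after a node finishes."""
    _ensure_table(conn)
    conn.execute(
        "INSERT INTO checkpoints VALUES (?, ?, ?)",
        (thread_id, step, json.dumps(state)),
    )
    conn.commit()

def load_latest(conn, thread_id):
    """Resume from the most recent checkpoint, or start fresh at step 0."""
    _ensure_table(conn)
    row = conn.execute(
        "SELECT step, state FROM checkpoints WHERE thread_id = ? "
        "ORDER BY step DESC LIMIT 1",
        (thread_id,),
    ).fetchone()
    return (row[0], json.loads(row[1])) if row else (0, None)

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # the project uses checkpoints.sqlite on disk
    save_checkpoint(conn, "run-1", 1, {"urls": ["u1", "u2"], "summaries": []})
    save_checkpoint(conn, "run-1", 2, {"urls": ["u2"], "summaries": ["s1"]})
    step, state = load_latest(conn, "run-1")
    print(step, state["summaries"])  # resumes at the last completed step
```

Because every node's output is written before the next node runs, a crash mid-run loses at most the node in flight, which is exactly the resume-from-failure behavior the walkthrough describes.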
- ---
-
- ## Example Usage
-
- **Topic:** "Benefits of LangGraph"
-
- **Agent Process:**
- 1. Searches Tavily → finds 5 candidate articles
- 2. Scrapes Article 1 → relevant → summarizes
- 3. Scrapes Article 2 → not relevant → skips
- 4. Scrapes Article 3 → relevant → summarizes
- 5. Scrapes Article 4 → relevant → summarizes
- 6. Scrapes Article 5 → relevant → summarizes
- 7. Compiles final report from 4 summaries
-
- **Result:** A comprehensive report covering LangGraph's benefits, compiled from 4 high-quality sources.
-
- ---
-
- ## Summary
-
- ✅ **Project is now fully functional!**
-
- - Updated LLM model to `llama-3.3-70b-versatile`
- - Fixed all dependency installation issues
- - Application running successfully on `http://localhost:8501`
-
- **Next steps:** Configure your API keys in the [.env](file:///c:/Users/punit/Desktop/project/GenAI/Research%20Agent%20with%20LangGraph/.env) file and start researching!