---
title: LLM Analysis Quiz Solver
emoji: π
colorFrom: red
colorTo: blue
sdk: docker
pinned: false
app_port: 7860
license: apache-2.0
---

# LLM Analysis - Autonomous Quiz Solver Agent

An autonomous agent built with LangGraph and LangChain that solves data-related quizzes involving web scraping, data processing, analysis, and visualization. The system uses Google's Gemini 2.5 Flash model to orchestrate tool usage and make decisions.

## Table of Contents

- [Overview](#overview)
- [Architecture](#architecture)
- [Features](#features)
- [Project Structure](#project-structure)
- [Installation](#installation)
- [Configuration](#configuration)
- [Usage](#usage)
- [API Endpoints](#api-endpoints)
- [Tools & Capabilities](#tools--capabilities)
- [Docker Deployment](#docker-deployment)
- [How It Works](#how-it-works)
- [Key Design Decisions](#key-design-decisions)
- [License](#license)

## Overview

This project was developed for the Tools in Data Science (TDS) course, where the objective is to build an application that autonomously solves multi-step quiz tasks involving:

- **Data sourcing**: Scraping websites, calling APIs, downloading files
- **Data preparation**: Cleaning text, PDFs, and various data formats
- **Data analysis**: Filtering, aggregating, statistical analysis, ML models
- **Data visualization**: Generating charts, narratives, and presentations

The system receives quiz URLs via a REST API, navigates through multiple quiz pages, solves each task using LLM-powered reasoning and specialized tools, and submits answers back to the evaluation server.

## Architecture

The project uses a **LangGraph state machine** architecture with the following components:

```
┌─────────────┐
│   FastAPI   │ ← Receives POST requests with quiz URLs
│   Server    │
└──────┬──────┘
       │
       ▼
┌─────────────┐
│    Agent    │ ← LangGraph orchestrator with Gemini 2.5 Flash
│    (LLM)    │
└──────┬──────┘
       │
       ├────────────┬────────────┬─────────────┬──────────────┐
       ▼            ▼            ▼             ▼              ▼
   [Scraper]  [Downloader]  [Code Exec]   [POST Req]     [Add Deps]
```

### Key Components:

1. **FastAPI Server** (`main.py`): Handles incoming POST requests, validates secrets, and triggers the agent
2. **LangGraph Agent** (`agent.py`): State machine that coordinates tool usage and decision-making
3. **Tools Package** (`tools/`): Modular tools for different capabilities
4. **LLM**: Google Gemini 2.5 Flash with rate limiting (9 requests per minute)

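
The 9-requests-per-minute cap on the LLM can be pictured as a sliding-window limiter. The following is a minimal standard-library sketch, not the project's actual code: the class name `SlidingWindowRateLimiter` and its parameters are hypothetical, and the real agent may rely on a rate limiter provided by its LLM library instead.

```python
import time
from collections import deque


class SlidingWindowRateLimiter:
    """Allow at most `max_calls` calls in any `period`-second window.

    Hypothetical helper for illustration; the defaults mirror the
    9 requests per minute mentioned above.
    """

    def __init__(self, max_calls=9, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._calls = deque()    # timestamps of recent calls, oldest first

    def acquire(self):
        """Block until a call slot is free, then record the call."""
        while True:
            now = self._clock()
            # Drop timestamps that have left the window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._calls[0]))
```

Calling `acquire()` before each model request would keep the agent within the quota. The injectable `clock` and `sleep` arguments exist only to make the sketch easy to test.
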
## Features

- ✅ **Autonomous multi-step problem solving**: Chains together multiple quiz pages
- ✅ **Dynamic JavaScript rendering**: Uses Playwright for client-side rendered pages
- ✅ **Code generation & execution**: Writes and runs Python code for data tasks
- ✅ **Flexible data handling**: Downloads files, processes PDFs, CSVs, images, etc.
- ✅ **Self-installing dependencies**: Automatically adds required Python packages
- ✅ **Robust error handling**: Retries failed attempts within time limits
- ✅ **Docker containerization**: Ready for deployment on HuggingFace Spaces or cloud platforms
- ✅ **Rate limiting**: Respects API quotas with exponential backoff

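
The "exponential backoff" mentioned in the feature list can be illustrated with a small retry helper. This is a sketch under assumptions: the function name `retry_with_backoff`, the delay values, and the jitter factor are all hypothetical and are not taken from the project's source.

```python
import random
import time


def retry_with_backoff(func, max_attempts=5, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep):
    """Call `func` until it succeeds or `max_attempts` is reached.

    The wait doubles after each failure (1s, 2s, 4s, ...) and is capped
    at `max_delay`. Up to 10% random jitter is added to each wait.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

A quota error from the LLM API is a typical case where a wrapper like this helps: each failed call waits longer before the next attempt instead of retrying immediately.
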
## Project Structure

```
LLM-Analysis-TDS-Project-2/
├── agent.py                        # LangGraph state machine & orchestration
├── main.py                         # FastAPI server with /solve endpoint
├── pyproject.toml                  # Project dependencies & configuration
├── Dockerfile                      # Container image with Playwright
├── .env                            # Environment variables (not in repo)
├── tools/
│   ├── __init__.py
│   ├── web_scraper.py              # Playwright-based HTML renderer
│   ├── code_generate_and_run.py    # Python code executor
│   ├── download_file.py            # File downloader
│   ├── send_request.py             # HTTP POST tool
│   └── add_dependencies.py         # Package installer
└── README.md
```

## Installation

### Prerequisites

- Python 3.12 or higher
- [uv](https://github.com/astral-sh/uv) package manager (recommended) or pip
- Git

### Step 1: Clone the Repository

```bash
git clone https://github.com/saivijayragav/LLM-Analysis-TDS-Project-2.git
cd LLM-Analysis-TDS-Project-2
```

### Step 2: Install Dependencies

#### Option A: Using `uv` (Recommended)

Ensure you have uv installed, then sync the project:

```bash
# Install uv if you haven't already
pip install uv

# Sync dependencies
uv sync
uv run playwright install chromium
```

Start the FastAPI server:

```bash
uv run main.py
```

The server will start at `http://0.0.0.0:7860`.

#### Option B: Using `pip`

```bash
# Create virtual environment
python -m venv venv
.\venv\Scripts\activate  # Windows
# source venv/bin/activate  # macOS/Linux

# Install dependencies
pip install -e .

# Install Playwright browsers
playwright install chromium
```

## Configuration

### Environment Variables

Create a `.env` file in the project root:

```env
# Your credentials from the Google Form submission
EMAIL=your.email@example.com
SECRET=your_secret_string

# Google Gemini API Key
GOOGLE_API_KEY=your_gemini_api_key_here
```
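
A missing variable is easier to diagnose at startup than in the middle of a quiz chain. The sketch below shows one way the three settings could be read and validated. It assumes the variables are already present in the process environment (for example, exported by the shell or loaded from `.env` by a loader such as python-dotenv). The names `Settings` and `load_settings` are hypothetical and not part of the project.

```python
import os
from dataclasses import dataclass

# Variable names taken from the .env example above.
REQUIRED_VARS = ("EMAIL", "SECRET", "GOOGLE_API_KEY")


@dataclass(frozen=True)
class Settings:
    email: str
    secret: str
    google_api_key: str


def load_settings(env=None):
    """Read the required variables, failing fast if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        email=env["EMAIL"],
        secret=env["SECRET"],
        google_api_key=env["GOOGLE_API_KEY"],
    )
```
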

### Getting a Gemini API Key

1. Visit [Google AI Studio](https://aistudio.google.com/app/apikey)
2. Create a new API key
3. Copy it to your `.env` file

## Usage

### Local Development

Start the FastAPI server:

```bash
# If using uv
uv run main.py

# If using standard Python
python main.py
```

The server will start on `http://0.0.0.0:7860`.

### Testing the Endpoint

Send a POST request to test your setup:

```bash
curl -X POST http://localhost:7860/solve \
  -H "Content-Type: application/json" \
  -d '{
    "email": "your.email@example.com",
    "secret": "your_secret_string",
    "url": "https://tds-llm-analysis.s-anand.net/demo"
  }'
```

Expected response:

```json
{
  "status": "ok"
}
```

The agent will run in the background and solve the quiz chain autonomously.

## API Endpoints

### `POST /solve`

Receives quiz tasks and triggers the autonomous agent.

**Request Body:**

```json
{
  "email": "your.email@example.com",
  "secret": "your_secret_string",
  "url": "https://example.com/quiz-123"
}
```

**Responses:**

| Status Code | Description                    |
| ----------- | ------------------------------ |
| `200`       | Secret verified, agent started |
| `400`       | Invalid JSON payload           |
| `403`       | Invalid secret                 |

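
The status codes in the table map onto a short validation routine. The following framework-free sketch shows the decision order only. The function name `check_solve_request` is hypothetical, and treating missing or non-string fields as a `400` is an assumption rather than documented behavior of `main.py`.

```python
import hmac
import json


def check_solve_request(raw_body: bytes, expected_secret: str):
    """Return (status_code, payload) for a raw /solve request body."""
    try:
        payload = json.loads(raw_body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400, None

    # Assumption: a body without string email/secret/url fields is also a 400.
    fields = ("email", "secret", "url")
    if not isinstance(payload, dict) or not all(
        isinstance(payload.get(name), str) for name in fields
    ):
        return 400, None

    # Constant-time comparison avoids leaking the secret through timing.
    if not hmac.compare_digest(payload["secret"].encode(), expected_secret.encode()):
        return 403, None

    return 200, payload
```

In a FastAPI handler, a `200` result would be the point at which the agent is scheduled as a background task and `{"status": "ok"}` is returned to the caller.
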
### `GET /healthz`

Health check endpoint for monitoring.

**Response:**

```json
{
  "status": "ok",
  "uptime_seconds": 3600
}
```

## Tools & Capabilities

The agent has access to the following tools:

### 1. **Web Scraper** (`get_rendered_html`)

- Uses Playwright to render JavaScript-heavy pages
- Waits for network idle before extracting content
- Returns fully rendered HTML for parsing

### 2. **File Downloader** (`download_file`)

- Downloads files (PDFs, CSVs, images, etc.) from direct URLs
- Saves files to `LLMFiles/` directory
- Returns the saved filename

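
The downloader's behavior (fetch a direct URL, save it under `LLMFiles/`, return the filename) can be approximated with the standard library alone. This is an illustrative sketch, not the code in `tools/download_file.py`: the use of `urllib`, the `dest_dir` and `timeout` parameters, and the `download.bin` fallback name are all assumptions.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import unquote, urlparse


def download_file(url: str, dest_dir: str = "LLMFiles", timeout: float = 30.0) -> str:
    """Download `url` into `dest_dir` and return the saved filename."""
    # Derive the filename from the last path segment of the URL.
    name = Path(unquote(urlparse(url).path)).name or "download.bin"
    target_dir = Path(dest_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / name
    # Stream the response to disk instead of loading it all into memory.
    with urllib.request.urlopen(url, timeout=timeout) as response, \
            open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return name
```

Returning only the filename keeps the tool's output short, which matters when every tool result is appended to the LLM's message history.
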
### 3. **Code Executor** (`run_code`)

- Executes arbitrary Python code in an isolated subprocess
- Returns stdout, stderr, and exit code
- Useful for data processing, analysis, and visualization

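
Running generated code in a subprocess and reporting stdout, stderr, and the exit code might look roughly like the sketch below. It is not the implementation in `tools/code_generate_and_run.py`: the temporary working directory, the `timeout` parameter, and the dictionary keys are assumptions made for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_code(code: str, timeout: float = 60.0) -> dict:
    """Run `code` in a separate Python process and capture its output."""
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "snippet.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],  # same interpreter as the agent
                capture_output=True,
                text=True,
                timeout=timeout,
                cwd=workdir,
            )
        except subprocess.TimeoutExpired:
            # Assumption: report a timeout as a failed run with exit code -1.
            return {"stdout": "", "stderr": f"Timed out after {timeout} seconds",
                    "return_code": -1}
    return {"stdout": proc.stdout, "stderr": proc.stderr,
            "return_code": proc.returncode}
```

A subprocess keeps crashes and infinite loops in generated code from taking down the server, but it is not a security sandbox: the child process has the same file and network access as the agent.
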
### 4. **POST Request** (`post_request`)

- Sends JSON payloads to submission endpoints
- Includes automatic error handling and response parsing
- Prevents resubmission when an answer is incorrect and the time limit has been exceeded

### 5. **Dependency Installer** (`add_dependencies`)

- Dynamically installs Python packages as needed
- Uses `uv add` for fast package resolution
- Enables the agent to adapt to different task requirements

## Docker Deployment

### Build the Image

```bash
docker build -t llm-analysis-agent .
```

### Run the Container

```bash
docker run -p 7860:7860 \
  -e EMAIL="your.email@example.com" \
  -e SECRET="your_secret_string" \
  -e GOOGLE_API_KEY="your_api_key" \
  llm-analysis-agent
```

### Deploy to HuggingFace Spaces

1. Create a new Space with Docker SDK
2. Push this repository to your Space
3. Add secrets in Space settings:
   - `EMAIL`
   - `SECRET`
   - `GOOGLE_API_KEY`
4. The Space will automatically build and deploy

## How It Works

### 1. Request Reception

- FastAPI receives a POST request with quiz URL
- Validates the secret against environment variables
- Returns 200 OK and starts the agent in the background

### 2. Agent Initialization

- LangGraph creates a state machine with two nodes: `agent` and `tools`
- The initial state contains the quiz URL as a user message

### 3. Task Loop

The agent follows this loop:

```
┌─────────────────────────────────────────┐
│ 1. LLM analyzes current state           │
│    - Reads quiz page instructions       │
│    - Plans tool usage                   │
└────────────────────┬────────────────────┘
                     ▼
┌─────────────────────────────────────────┐
│ 2. Tool execution                       │
│    - Scrapes page / downloads files     │
│    - Runs analysis code                 │
│    - Submits answer                     │
└────────────────────┬────────────────────┘
                     ▼
┌─────────────────────────────────────────┐
│ 3. Response evaluation                  │
│    - Checks if answer is correct        │
│    - Extracts next quiz URL (if exists) │
└────────────────────┬────────────────────┘
                     ▼
┌─────────────────────────────────────────┐
│ 4. Decision                             │
│    - If new URL exists: Loop to step 1  │
│    - If no URL: Return "END"            │
└─────────────────────────────────────────┘
```
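
Stripped of the LLM and tools, the control flow of this loop reduces to following a chain of URLs until none is returned. The sketch below is a simplification for illustration, not the LangGraph graph in `agent.py`: `solve_quiz_chain` and the `solve_page` callback are hypothetical names, with `solve_page` standing in for steps 1 to 3 (analyze, run tools, evaluate the response) and returning the next quiz URL or `None`.

```python
from typing import Callable, List, Optional


def solve_quiz_chain(start_url: str,
                     solve_page: Callable[[str], Optional[str]],
                     max_steps: int = 200) -> List[str]:
    """Follow quiz URLs until `solve_page` returns None.

    `max_steps` is a safety cap playing the role of a recursion limit.
    Returns the list of URLs that were visited, in order.
    """
    visited: List[str] = []
    url: Optional[str] = start_url
    while url is not None:
        if len(visited) >= max_steps:
            raise RuntimeError("step limit reached before the chain ended")
        visited.append(url)
        # Steps 1-3: solve the page and get the next URL (or None to stop).
        url = solve_page(url)
    return visited
```

In the real agent the decision in step 4 is made by the LLM from the submission response, so the "next URL" comes out of the message history rather than from a plain return value.
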

### 4. State Management

- All messages (user, assistant, tool) are stored in state
- The LLM uses full history to make informed decisions
- Recursion limit set to 200 to handle long quiz chains

### 5. Completion

- Agent returns "END" when no new URL is provided
- Background task completes
- Logs indicate success or failure

## Key Design Decisions

1. **LangGraph over Sequential Execution**: Allows flexible routing and complex decision-making
2. **Background Processing**: Prevents HTTP timeouts for long-running quiz chains
3. **Tool Modularity**: Each tool is independent and can be tested/debugged separately
4. **Rate Limiting**: Prevents API quota exhaustion (9 req/min for Gemini)
5. **Code Execution**: Dynamically generates and runs Python for complex data tasks
6. **Playwright for Scraping**: Handles JavaScript-rendered pages that `requests` cannot
7. **uv for Dependencies**: Fast package resolution and installation

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

---

**Author**: Sai Vijay Ragav
**Course**: Tools in Data Science (TDS)
**Institution**: IIT Madras

For questions or issues, please open an issue on the [GitHub repository](https://github.com/saivijayragav/LLM-Analysis-TDS-Project-2).