Spaces:
Sleeping
Sleeping
| title: LLM Analysis Quiz Solver | |
| emoji: π | |
| colorFrom: red | |
| colorTo: blue | |
| sdk: docker | |
| pinned: false | |
| app_port: 7860 | |
| # LLM Analysis - Autonomous Quiz Solver Agent | |
| [](https://opensource.org/licenses/MIT) | |
| [](https://www.python.org/downloads/) | |
| [](https://fastapi.tiangolo.com/) | |
| An intelligent, autonomous agent built with LangGraph and LangChain that solves data-related quizzes involving web scraping, data processing, analysis, and visualization tasks. The system uses Google's Gemini 2.5 Flash model to orchestrate tool usage and make decisions. | |
| ## π Table of Contents | |
| - [Overview](#overview) | |
| - [Architecture](#architecture) | |
| - [Features](#features) | |
| - [Project Structure](#project-structure) | |
| - [Installation](#installation) | |
| - [Configuration](#configuration) | |
| - [Usage](#usage) | |
| - [API Endpoints](#api-endpoints) | |
| - [Tools & Capabilities](#tools--capabilities) | |
| - [Docker Deployment](#docker-deployment) | |
| - [How It Works](#how-it-works) | |
| - [License](#license) | |
| ## π Overview | |
| This project was developed for the TDS (Tools in Data Science) course project, where the objective is to build an application that can autonomously solve multi-step quiz tasks involving: | |
| - **Data sourcing**: Scraping websites, calling APIs, downloading files | |
| - **Data preparation**: Cleaning text, PDFs, and various data formats | |
| - **Data analysis**: Filtering, aggregating, statistical analysis, ML models | |
| - **Data visualization**: Generating charts, narratives, and presentations | |
| The system receives quiz URLs via a REST API, navigates through multiple quiz pages, solves each task using LLM-powered reasoning and specialized tools, and submits answers back to the evaluation server. | |
| ## ποΈ Architecture | |
| The project uses a **LangGraph state machine** architecture with the following components: | |
| ``` | |
| βββββββββββββββ | |
| β FastAPI β β Receives POST requests with quiz URLs | |
| β Server β | |
| ββββββββ¬βββββββ | |
| β | |
| βΌ | |
| βββββββββββββββ | |
| β Agent β β LangGraph orchestrator with Gemini 2.5 Flash | |
| β (LLM) β | |
| ββββββββ¬βββββββ | |
| β | |
| ββββββββββββββ¬βββββββββββββ¬ββββββββββββββ¬βββββββββββββββ | |
| βΌ βΌ βΌ βΌ βΌ | |
| [Scraper] [Downloader] [Code Exec] [POST Req] [Add Deps] | |
| ``` | |
| ### Key Components: | |
| 1. **FastAPI Server** (`main.py`): Handles incoming POST requests, validates secrets, and triggers the agent | |
| 2. **LangGraph Agent** (`agent.py`): State machine that coordinates tool usage and decision-making | |
| 3. **Tools Package** (`tools/`): Modular tools for different capabilities | |
| 4. **LLM**: Google Gemini 2.5 Flash with rate limiting (9 requests per minute) | |
| ## β¨ Features | |
| - β **Autonomous multi-step problem solving**: Chains together multiple quiz pages | |
| - β **Dynamic JavaScript rendering**: Uses Playwright for client-side rendered pages | |
| - β **Code generation & execution**: Writes and runs Python code for data tasks | |
| - β **Flexible data handling**: Downloads files, processes PDFs, CSVs, images, etc. | |
| - β **Self-installing dependencies**: Automatically adds required Python packages | |
| - β **Robust error handling**: Retries failed attempts within time limits | |
| - β **Docker containerization**: Ready for deployment on HuggingFace Spaces or cloud platforms | |
| - β **Rate limiting**: Respects API quotas with exponential backoff | |
| ## π Project Structure | |
| ``` | |
| DoMyDA/ | |
| βββ agent.py # LangGraph state machine & orchestration | |
| βββ main.py # FastAPI server with /solve endpoint | |
| βββ pyproject.toml # Project dependencies & configuration | |
| βββ Dockerfile # Container image with Playwright | |
| βββ .env # Environment variables (not in repo) | |
| βββ tools/ | |
| β βββ __init__.py | |
| β βββ web_scraper.py # Playwright-based HTML renderer | |
| β βββ code_generate_and_run.py # Python code executor | |
| β βββ download_file.py # File downloader | |
| β βββ send_request.py # HTTP POST tool | |
| β βββ add_dependencies.py # Package installer | |
| βββ README.md | |
| ``` | |
| ## π¦ Installation | |
| ### Prerequisites | |
| - Python 3.12 or higher | |
| - [uv](https://github.com/astral-sh/uv) package manager (recommended) or pip | |
| - Git | |
| ### Step 1: Clone the Repository | |
| ```bash | |
| git clone https://github.com/samarthnaikk/DoMyDA.git | |
| cd DoMyDA | |
| ``` | |
| ### Step 2: Install Dependencies | |
| #### Option A: Using `uv` (Recommended) | |
| Ensure you have uv installed, then sync the project: | |
| ``` | |
| # Install uv if you haven't already | |
| pip install uv | |
| # Sync dependencies | |
| uv sync | |
| uv run playwright install chromium | |
| ``` | |
| Start the FastAPI server: | |
| ``` | |
| uv run main.py | |
| ``` | |
| The server will start at ```http://0.0.0.0:7860```. | |
| #### Option B: Using `pip` | |
| ```bash | |
| # Create virtual environment | |
| python -m venv venv | |
| .\venv\Scripts\activate # Windows | |
| # source venv/bin/activate # macOS/Linux | |
| # Install dependencies | |
| pip install -e . | |
| # Install Playwright browsers | |
| playwright install chromium | |
| ``` | |
| ## βοΈ Configuration | |
| ### Environment Variables | |
| Create a `.env` file in the project root: | |
| ```env | |
| # Your credentials from the Google Form submission | |
| EMAIL=your.email@example.com | |
| SECRET=your_secret_string | |
| # Google Gemini API Key | |
| GOOGLE_API_KEY=your_gemini_api_key_here | |
| ``` | |
| ### Getting a Gemini API Key | |
| 1. Visit [Google AI Studio](https://aistudio.google.com/app/apikey) | |
| 2. Create a new API key | |
| 3. Copy it to your `.env` file | |
| ## π Usage | |
| ### Local Development | |
| Start the FastAPI server: | |
| ```bash | |
| # If using uv | |
| uv run main.py | |
| # If using standard Python | |
| python main.py | |
| ``` | |
| The server will start on `http://0.0.0.0:7860` | |
| ### Testing the Endpoint | |
| Send a POST request to test your setup: | |
| ```bash | |
| curl -X POST http://localhost:7860/solve \ | |
| -H "Content-Type: application/json" \ | |
| -d '{ | |
| "email": "your.email@example.com", | |
| "secret": "your_secret_string", | |
| "url": "https://tds-llm-analysis.s-anand.net/demo" | |
| }' | |
| ``` | |
| Expected response: | |
| ```json | |
| { | |
| "status": "ok" | |
| } | |
| ``` | |
| The agent will run in the background and solve the quiz chain autonomously. | |
| ## π API Endpoints | |
| ### `POST /solve` | |
| Receives quiz tasks and triggers the autonomous agent. | |
| **Request Body:** | |
| ```json | |
| { | |
| "email": "your.email@example.com", | |
| "secret": "your_secret_string", | |
| "url": "https://example.com/quiz-123" | |
| } | |
| ``` | |
| **Responses:** | |
| | Status Code | Description | | |
| | ----------- | ------------------------------ | | |
| | `200` | Secret verified, agent started | | |
| | `400` | Invalid JSON payload | | |
| | `403` | Invalid secret | | |
| ### `GET /healthz` | |
| Health check endpoint for monitoring. | |
| **Response:** | |
| ```json | |
| { | |
| "status": "ok", | |
| "uptime_seconds": 3600 | |
| } | |
| ``` | |
| ## π οΈ Tools & Capabilities | |
| The agent has access to the following tools: | |
| ### 1. **Web Scraper** (`get_rendered_html`) | |
| - Uses Playwright to render JavaScript-heavy pages | |
| - Waits for network idle before extracting content | |
| - Returns fully rendered HTML for parsing | |
| ### 2. **File Downloader** (`download_file`) | |
| - Downloads files (PDFs, CSVs, images, etc.) from direct URLs | |
| - Saves files to `LLMFiles/` directory | |
| - Returns the saved filename | |
| ### 3. **Code Executor** (`run_code`) | |
| - Executes arbitrary Python code in an isolated subprocess | |
| - Returns stdout, stderr, and exit code | |
| - Useful for data processing, analysis, and visualization | |
| ### 4. **POST Request** (`post_request`) | |
| - Sends JSON payloads to submission endpoints | |
| - Includes automatic error handling and response parsing | |
| - Prevents resubmission if answer is incorrect and time limit exceeded | |
| ### 5. **Dependency Installer** (`add_dependencies`) | |
| - Dynamically installs Python packages as needed | |
| - Uses `uv add` for fast package resolution | |
| - Enables the agent to adapt to different task requirements | |
| ## π³ Docker Deployment | |
| ### Build the Image | |
| ```bash | |
| docker build -t llm-analysis-agent . | |
| ``` | |
| ### Run the Container | |
| ```bash | |
| docker run -p 7860:7860 \ | |
| -e EMAIL="your.email@example.com" \ | |
| -e SECRET="your_secret_string" \ | |
| -e GOOGLE_API_KEY="your_api_key" \ | |
| llm-analysis-agent | |
| ``` | |
| ### Deploy to HuggingFace Spaces | |
| 1. Create a new Space with Docker SDK | |
| 2. Push this repository to your Space | |
| 3. Add secrets in Space settings: | |
| - `EMAIL` | |
| - `SECRET` | |
| - `GOOGLE_API_KEY` | |
| 4. The Space will automatically build and deploy | |
| ## π§ How It Works | |
| ### 1. Request Reception | |
| - FastAPI receives a POST request with quiz URL | |
| - Validates the secret against environment variables | |
| - Returns 200 OK and starts the agent in the background | |
| ### 2. Agent Initialization | |
| - LangGraph creates a state machine with two nodes: `agent` and `tools` | |
| - The initial state contains the quiz URL as a user message | |
| ### 3. Task Loop | |
| The agent follows this loop: | |
| ``` | |
| βββββββββββββββββββββββββββββββββββββββββββ | |
| β 1. LLM analyzes current state β | |
| β - Reads quiz page instructions β | |
| β - Plans tool usage β | |
| βββββββββββββββββββ¬ββββββββββββββββββββββββ | |
| βΌ | |
| βββββββββββββββββββββββββββββββββββββββββββ | |
| β 2. Tool execution β | |
| β - Scrapes page / downloads files β | |
| β - Runs analysis code β | |
| β - Submits answer β | |
| βββββββββββββββββββ¬ββββββββββββββββββββββββ | |
| βΌ | |
| βββββββββββββββββββββββββββββββββββββββββββ | |
| β 3. Response evaluation β | |
| β - Checks if answer is correct β | |
| β - Extracts next quiz URL (if exists) β | |
| βββββββββββββββββββ¬ββββββββββββββββββββββββ | |
| βΌ | |
| βββββββββββββββββββββββββββββββββββββββββββ | |
| β 4. Decision β | |
| β - If new URL exists: Loop to step 1 β | |
| β - If no URL: Return "END" β | |
| βββββββββββββββββββββββββββββββββββββββββββ | |
| ``` | |
| ### 4. State Management | |
| - All messages (user, assistant, tool) are stored in state | |
| - The LLM uses full history to make informed decisions | |
| - Recursion limit set to 200 to handle long quiz chains | |
| ### 5. Completion | |
| - Agent returns "END" when no new URL is provided | |
| - Background task completes | |
| - Logs indicate success or failure | |
| ## π Key Design Decisions | |
| 1. **LangGraph over Sequential Execution**: Allows flexible routing and complex decision-making | |
| 2. **Background Processing**: Prevents HTTP timeouts for long-running quiz chains | |
| 3. **Tool Modularity**: Each tool is independent and can be tested/debugged separately | |
| 4. **Rate Limiting**: Prevents API quota exhaustion (9 req/min for Gemini) | |
| 5. **Code Execution**: Dynamically generates and runs Python for complex data tasks | |
| 6. **Playwright for Scraping**: Handles JavaScript-rendered pages that `requests` cannot | |
| 7. **uv for Dependencies**: Fast package resolution and installation | |
| ## π License | |
| This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details. | |
| --- | |
| **Author**: samarthnaikk | |
| **Course**: Tools in Data Science (TDS) | |
| **Institution**: IIT Madras |