ShoaibSSM commited on
Commit
331099c
Β·
verified Β·
1 Parent(s): fe410bc

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +386 -5
README.md CHANGED
@@ -1,11 +1,392 @@
1
  ---
2
- title: LLM Analysis TDS Project 2
3
- emoji: πŸ‘
4
- colorFrom: purple
5
- colorTo: gray
6
  sdk: docker
7
  pinned: false
 
8
  license: apache-2.0
9
  ---
10
 
11
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
  ---
2
+ title: LLM Analysis Quiz Solver
3
+ emoji: πŸƒ
4
+ colorFrom: red
5
+ colorTo: blue
6
  sdk: docker
7
  pinned: false
8
+ app_port: 7860
9
  license: apache-2.0
10
  ---
11
 
12
+ # LLM Analysis - Autonomous Quiz Solver Agent
13
+
14
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
15
+ [![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/)
16
+ [![FastAPI](https://img.shields.io/badge/FastAPI-0.121.3+-green.svg)](https://fastapi.tiangolo.com/)
17
+
18
+ An intelligent, autonomous agent built with LangGraph and LangChain that solves data-related quizzes involving web scraping, data processing, analysis, and visualization tasks. The system uses Google's Gemini 2.5 Flash model to orchestrate tool usage and make decisions.
19
+
20
+ ## πŸ“‹ Table of Contents
21
+
22
+ - [Overview](#overview)
23
+ - [Architecture](#architecture)
24
+ - [Features](#features)
25
+ - [Project Structure](#project-structure)
26
+ - [Installation](#installation)
27
+ - [Configuration](#configuration)
28
+ - [Usage](#usage)
29
+ - [API Endpoints](#api-endpoints)
30
+ - [Tools & Capabilities](#tools--capabilities)
31
+ - [Docker Deployment](#docker-deployment)
32
+ - [How It Works](#how-it-works)
33
+ - [License](#license)
34
+
35
+ ## πŸ” Overview
36
+
37
+ This project was developed for the TDS (Tools in Data Science) course project, where the objective is to build an application that can autonomously solve multi-step quiz tasks involving:
38
+
39
+ - **Data sourcing**: Scraping websites, calling APIs, downloading files
40
+ - **Data preparation**: Cleaning text, PDFs, and various data formats
41
+ - **Data analysis**: Filtering, aggregating, statistical analysis, ML models
42
+ - **Data visualization**: Generating charts, narratives, and presentations
43
+
44
+ The system receives quiz URLs via a REST API, navigates through multiple quiz pages, solves each task using LLM-powered reasoning and specialized tools, and submits answers back to the evaluation server.
45
+
46
+ ## πŸ—οΈ Architecture
47
+
48
+ The project uses a **LangGraph state machine** architecture with the following components:
49
+
50
+ ```
51
+ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
52
+ β”‚ FastAPI β”‚ ← Receives POST requests with quiz URLs
53
+ β”‚ Server β”‚
54
+ β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
55
+ β”‚
56
+ β–Ό
57
+ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
58
+ β”‚ Agent β”‚ ← LangGraph orchestrator with Gemini 2.5 Flash
59
+ β”‚ (LLM) β”‚
60
+ β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
61
+ β”‚
62
+ β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
63
+ β–Ό β–Ό β–Ό β–Ό β–Ό
64
+ [Scraper] [Downloader] [Code Exec] [POST Req] [Add Deps]
65
+ ```
66
+
67
+ ### Key Components:
68
+
69
+ 1. **FastAPI Server** (`main.py`): Handles incoming POST requests, validates secrets, and triggers the agent
70
+ 2. **LangGraph Agent** (`agent.py`): State machine that coordinates tool usage and decision-making
71
+ 3. **Tools Package** (`tools/`): Modular tools for different capabilities
72
+ 4. **LLM**: Google Gemini 2.5 Flash with rate limiting (9 requests per minute)
73
+
74
+ ## ✨ Features
75
+
76
+ - βœ… **Autonomous multi-step problem solving**: Chains together multiple quiz pages
77
+ - βœ… **Dynamic JavaScript rendering**: Uses Playwright for client-side rendered pages
78
+ - βœ… **Code generation & execution**: Writes and runs Python code for data tasks
79
+ - βœ… **Flexible data handling**: Downloads files, processes PDFs, CSVs, images, etc.
80
+ - βœ… **Self-installing dependencies**: Automatically adds required Python packages
81
+ - βœ… **Robust error handling**: Retries failed attempts within time limits
82
+ - βœ… **Docker containerization**: Ready for deployment on HuggingFace Spaces or cloud platforms
83
+ - βœ… **Rate limiting**: Respects API quotas with exponential backoff
84
+
85
+ ## πŸ“ Project Structure
86
+
87
+ ```
88
+ LLM-Analysis-TDS-Project-2/
89
+ β”œβ”€β”€ agent.py # LangGraph state machine & orchestration
90
+ β”œβ”€β”€ main.py # FastAPI server with /solve endpoint
91
+ β”œβ”€β”€ pyproject.toml # Project dependencies & configuration
92
+ β”œβ”€β”€ Dockerfile # Container image with Playwright
93
+ β”œβ”€β”€ .env # Environment variables (not in repo)
94
+ β”œβ”€β”€ tools/
95
+ β”‚ β”œβ”€β”€ __init__.py
96
+ β”‚ β”œβ”€β”€ web_scraper.py # Playwright-based HTML renderer
97
+ β”‚ β”œβ”€β”€ code_generate_and_run.py # Python code executor
98
+ β”‚ β”œβ”€β”€ download_file.py # File downloader
99
+ β”‚ β”œβ”€β”€ send_request.py # HTTP POST tool
100
+ β”‚ └── add_dependencies.py # Package installer
101
+ └── README.md
102
+ ```
103
+
104
+ ## πŸ“¦ Installation
105
+
106
+ ### Prerequisites
107
+
108
+ - Python 3.12 or higher
109
+ - [uv](https://github.com/astral-sh/uv) package manager (recommended) or pip
110
+ - Git
111
+
112
+ ### Step 1: Clone the Repository
113
+
114
+ ```bash
115
+ git clone https://github.com/saivijayragav/LLM-Analysis-TDS-Project-2.git
116
+ cd LLM-Analysis-TDS-Project-2
117
+ ```
118
+
119
+ ### Step 2: Install Dependencies
120
+
121
+ #### Option A: Using `uv` (Recommended)
122
+
123
+
124
+ Ensure you have uv installed, then sync the project:
125
+
126
+ ```
127
+ # Install uv if you haven't already
128
+ pip install uv
129
+
130
+ # Sync dependencies
131
+ uv sync
132
+ uv run playwright install chromium
133
+ ```
134
+
135
+ Start the FastAPI server:
136
+ ```
137
+ uv run main.py
138
+ ```
139
+ The server will start at ```http://0.0.0.0:7860```.
140
+
141
+ #### Option B: Using `pip`
142
+
143
+ ```bash
144
+ # Create virtual environment
145
+ python -m venv venv
146
+ .\venv\Scripts\activate # Windows
147
+ # source venv/bin/activate # macOS/Linux
148
+
149
+ # Install dependencies
150
+ pip install -e .
151
+
152
+ # Install Playwright browsers
153
+ playwright install chromium
154
+ ```
155
+
156
+ ## βš™οΈ Configuration
157
+
158
+ ### Environment Variables
159
+
160
+ Create a `.env` file in the project root:
161
+
162
+ ```env
163
+ # Your credentials from the Google Form submission
164
+ EMAIL=your.email@example.com
165
+ SECRET=your_secret_string
166
+
167
+ # Google Gemini API Key
168
+ GOOGLE_API_KEY=your_gemini_api_key_here
169
+ ```
170
+
171
+ ### Getting a Gemini API Key
172
+
173
+ 1. Visit [Google AI Studio](https://aistudio.google.com/app/apikey)
174
+ 2. Create a new API key
175
+ 3. Copy it to your `.env` file
176
+
177
+ ## πŸš€ Usage
178
+
179
+ ### Local Development
180
+
181
+ Start the FastAPI server:
182
+
183
+ ```bash
184
+ # If using uv
185
+ uv run main.py
186
+
187
+ # If using standard Python
188
+ python main.py
189
+ ```
190
+
191
+ The server will start on `http://0.0.0.0:7860`
192
+
193
+ ### Testing the Endpoint
194
+
195
+ Send a POST request to test your setup:
196
+
197
+ ```bash
198
+ curl -X POST http://localhost:7860/solve \
199
+ -H "Content-Type: application/json" \
200
+ -d '{
201
+ "email": "your.email@example.com",
202
+ "secret": "your_secret_string",
203
+ "url": "https://tds-llm-analysis.s-anand.net/demo"
204
+ }'
205
+ ```
206
+
207
+ Expected response:
208
+
209
+ ```json
210
+ {
211
+ "status": "ok"
212
+ }
213
+ ```
214
+
215
+ The agent will run in the background and solve the quiz chain autonomously.
216
+
217
+ ## 🌐 API Endpoints
218
+
219
+ ### `POST /solve`
220
+
221
+ Receives quiz tasks and triggers the autonomous agent.
222
+
223
+ **Request Body:**
224
+
225
+ ```json
226
+ {
227
+ "email": "your.email@example.com",
228
+ "secret": "your_secret_string",
229
+ "url": "https://example.com/quiz-123"
230
+ }
231
+ ```
232
+
233
+ **Responses:**
234
+
235
+ | Status Code | Description |
236
+ | ----------- | ------------------------------ |
237
+ | `200` | Secret verified, agent started |
238
+ | `400` | Invalid JSON payload |
239
+ | `403` | Invalid secret |
240
+
241
+ ### `GET /healthz`
242
+
243
+ Health check endpoint for monitoring.
244
+
245
+ **Response:**
246
+
247
+ ```json
248
+ {
249
+ "status": "ok",
250
+ "uptime_seconds": 3600
251
+ }
252
+ ```
253
+
254
+ ## πŸ› οΈ Tools & Capabilities
255
+
256
+ The agent has access to the following tools:
257
+
258
+ ### 1. **Web Scraper** (`get_rendered_html`)
259
+
260
+ - Uses Playwright to render JavaScript-heavy pages
261
+ - Waits for network idle before extracting content
262
+ - Returns fully rendered HTML for parsing
263
+
264
+ ### 2. **File Downloader** (`download_file`)
265
+
266
+ - Downloads files (PDFs, CSVs, images, etc.) from direct URLs
267
+ - Saves files to `LLMFiles/` directory
268
+ - Returns the saved filename
269
+
270
+ ### 3. **Code Executor** (`run_code`)
271
+
272
+ - Executes arbitrary Python code in an isolated subprocess
273
+ - Returns stdout, stderr, and exit code
274
+ - Useful for data processing, analysis, and visualization
275
+
276
+ ### 4. **POST Request** (`post_request`)
277
+
278
+ - Sends JSON payloads to submission endpoints
279
+ - Includes automatic error handling and response parsing
280
+ - Prevents resubmission if answer is incorrect and time limit exceeded
281
+
282
+ ### 5. **Dependency Installer** (`add_dependencies`)
283
+
284
+ - Dynamically installs Python packages as needed
285
+ - Uses `uv add` for fast package resolution
286
+ - Enables the agent to adapt to different task requirements
287
+
288
+ ## 🐳 Docker Deployment
289
+
290
+ ### Build the Image
291
+
292
+ ```bash
293
+ docker build -t llm-analysis-agent .
294
+ ```
295
+
296
+ ### Run the Container
297
+
298
+ ```bash
299
+ docker run -p 7860:7860 \
300
+ -e EMAIL="your.email@example.com" \
301
+ -e SECRET="your_secret_string" \
302
+ -e GOOGLE_API_KEY="your_api_key" \
303
+ llm-analysis-agent
304
+ ```
305
+
306
+ ### Deploy to HuggingFace Spaces
307
+
308
+ 1. Create a new Space with Docker SDK
309
+ 2. Push this repository to your Space
310
+ 3. Add secrets in Space settings:
311
+ - `EMAIL`
312
+ - `SECRET`
313
+ - `GOOGLE_API_KEY`
314
+ 4. The Space will automatically build and deploy
315
+
316
+ ## 🧠 How It Works
317
+
318
+ ### 1. Request Reception
319
+
320
+ - FastAPI receives a POST request with quiz URL
321
+ - Validates the secret against environment variables
322
+ - Returns 200 OK and starts the agent in the background
323
+
324
+ ### 2. Agent Initialization
325
+
326
+ - LangGraph creates a state machine with two nodes: `agent` and `tools`
327
+ - The initial state contains the quiz URL as a user message
328
+
329
+ ### 3. Task Loop
330
+
331
+ The agent follows this loop:
332
+
333
+ ```
334
+ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
335
+ β”‚ 1. LLM analyzes current state β”‚
336
+ β”‚ - Reads quiz page instructions β”‚
337
+ β”‚ - Plans tool usage β”‚
338
+ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
339
+ β–Ό
340
+ β”ŒοΏ½οΏ½οΏ½β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
341
+ β”‚ 2. Tool execution β”‚
342
+ β”‚ - Scrapes page / downloads files β”‚
343
+ β”‚ - Runs analysis code β”‚
344
+ β”‚ - Submits answer β”‚
345
+ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
346
+ β–Ό
347
+ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
348
+ β”‚ 3. Response evaluation β”‚
349
+ β”‚ - Checks if answer is correct β”‚
350
+ β”‚ - Extracts next quiz URL (if exists) β”‚
351
+ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
352
+ β–Ό
353
+ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
354
+ β”‚ 4. Decision β”‚
355
+ β”‚ - If new URL exists: Loop to step 1 β”‚
356
+ β”‚ - If no URL: Return "END" β”‚
357
+ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
358
+ ```
359
+
360
+ ### 4. State Management
361
+
362
+ - All messages (user, assistant, tool) are stored in state
363
+ - The LLM uses full history to make informed decisions
364
+ - Recursion limit set to 200 to handle long quiz chains
365
+
366
+ ### 5. Completion
367
+
368
+ - Agent returns "END" when no new URL is provided
369
+ - Background task completes
370
+ - Logs indicate success or failure
371
+
372
+ ## πŸ“ Key Design Decisions
373
+
374
+ 1. **LangGraph over Sequential Execution**: Allows flexible routing and complex decision-making
375
+ 2. **Background Processing**: Prevents HTTP timeouts for long-running quiz chains
376
+ 3. **Tool Modularity**: Each tool is independent and can be tested/debugged separately
377
+ 4. **Rate Limiting**: Prevents API quota exhaustion (9 req/min for Gemini)
378
+ 5. **Code Execution**: Dynamically generates and runs Python for complex data tasks
379
+ 6. **Playwright for Scraping**: Handles JavaScript-rendered pages that `requests` cannot
380
+ 7. **uv for Dependencies**: Fast package resolution and installation
381
+
382
+ ## πŸ“„ License
383
+
384
+ This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
385
+
386
+ ---
387
+
388
+ **Author**: Sai Vijay Ragav
389
+ **Course**: Tools in Data Science (TDS)
390
+ **Institution**: IIT Madras
391
+
392
+ For questions or issues, please open an issue on the [GitHub repository](https://github.com/saivijayragav/LLM-Analysis-TDS-Project-2).