	---
	title: MetaSearch API
	emoji: 🔬
	colorFrom: blue
	colorTo: indigo
	sdk: gradio
	sdk_version: "5.9.1"
	app_file: app.py
	pinned: false
	license: mit
	---

	# 🔬 Automated Consensus Analysis API

	A HuggingFace Spaces API that automates consensus analysis of peer reviews using LLMs and search-augmented verification.

	## 🌟 Features

	- Critique Extraction: Extract structured critique points from peer reviews using Gemini 2.0
	- Disagreement Detection: Identify conflicts and disagreements between reviewers
	- Search-Augmented Verification: Retrieve supporting/contradicting evidence from academic sources
	- Disagreement Resolution: AI-powered resolution using DeepSeek-R1 with reasoning
	- Meta-Review Generation: Comprehensive meta-reviews synthesizing all analyses
	- Rate Limiting: 10 requests per minute per client
	- Queue Management: Up to 3 concurrent pipeline executions
	- Progress Tracking: Real-time status updates for long-running tasks

	## 🚀 Quick Start

	### Local Development

	1. Clone and set up

	```bash
	cd api
	python -m venv venv
	source venv/bin/activate # On Windows: venv\Scripts\activate
	pip install -r requirements.txt
	```

	2. Configure environment

	```bash
	cp .env.example .env
	# Edit .env with your API keys
	```

	3. Run the application

	```bash
	python app.py
	```

	Visit `http://localhost:7860` to access the Gradio interface.

	### HuggingFace Spaces Deployment

	1. Create a new Space

	- Go to [HuggingFace Spaces](https://huggingface.co/spaces)
	- Click "Create new Space"
	- Select "Gradio" as SDK

	2. Upload files

	- Upload all files from the `api/` directory
	- Ensure `requirements.txt` and `app.py` are in the root

	3. Configure secrets

	- Go to Space Settings → Repository secrets
	- Add the following secrets:
	  - `GEMINI_API_KEY`
	  - `OPENROUTER_API_KEY`
	  - `TAVILY_API_KEY`
	  - `SERPAPI_API_KEY` (optional)

	4. Deploy

	- The Space will automatically build and deploy

	## 📚 API Endpoints

	### Full Pipeline

	Endpoint: `/api/full_pipeline`
	Method: POST
	Description: Run the complete consensus analysis pipeline

	Request Body:

	```json
	{
	  "paper_title": "Visual Correspondence Hallucination",
	  "paper_abstract": "This paper investigates...",
	  "reviews": [
	    "Review 1: The methodology is sound but...",
	    "Review 2: While the experiments are comprehensive..."
	  ]
	}
	```

	Response:

	```json
	{
	  "request_id": "req_123456789",
	  "paper_title": "...",
	  "critique_points": [...],
	  "disagreements": [...],
	  "search_results": {...},
	  "resolution": [...],
	  "meta_review": "..."
	}
	```
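Checking the request body on the client side avoids spending a rate-limited call on a malformed payload. The helper below is a minimal sketch: `build_pipeline_request` is a hypothetical name, and the rules it enforces (non-empty title and abstract, at least two reviews) are assumptions, since this README does not state the server's actual validation rules.

```python
from typing import Any


def build_pipeline_request(
    paper_title: str, paper_abstract: str, reviews: list[str]
) -> dict[str, Any]:
    """Build a /api/full_pipeline request body (hypothetical client-side helper).

    Assumption: the server expects non-empty strings and at least two reviews.
    """
    if not paper_title.strip():
        raise ValueError("paper_title must not be empty")
    if not paper_abstract.strip():
        raise ValueError("paper_abstract must not be empty")

    # Drop blank reviews and surrounding whitespace before counting.
    cleaned = [review.strip() for review in reviews if review.strip()]
    if len(cleaned) < 2:
        raise ValueError("at least two non-empty reviews are needed to compare reviewers")

    return {
        "paper_title": paper_title.strip(),
        "paper_abstract": paper_abstract.strip(),
        "reviews": cleaned,
    }
```

The returned dictionary can be passed directly as the `json=` argument of an HTTP client call.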

	### Individual Stages

	#### Critique Extraction

	Endpoint: `/api/critique_extraction`
	Method: POST

	```json
	{
	  "reviews": ["Review 1 text...", "Review 2 text..."]
	}
	```

	#### Disagreement Detection

	Endpoint: `/api/disagreement_detection`
	Method: POST

	```json
	{
	  "critiques": [
	    {"Methodology": [...], "Experiments": [...]},
	    {"Methodology": [...], "Experiments": [...]}
	  ]
	}
	```

	#### Search & Retrieval

	Endpoint: `/api/search_retrieval`
	Method: POST

	```json
	{
	  "paper_title": "...",
	  "paper_abstract": "...",
	  "critiques": [...]
	}
	```

	#### Progress Tracking

	Endpoint: `/api/progress/{request_id}`
	Method: GET

	Response:

	```json
	{
	  "stage": "search_retrieval",
	  "progress": 0.5,
	  "message": "Searching for relevant research...",
	  "timestamp": "2025-01-15T10:30:00"
	}
	```
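A client can poll this endpoint while a pipeline run is in flight. The sketch below uses only the standard library. It assumes that a `progress` value of `1.0` means the run has finished, which this README does not confirm. `BASE_URL` is the placeholder host, and `fetch_progress` and `wait_for_completion` are hypothetical helper names.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "https://your-space.hf.space"  # placeholder host; replace with your Space


def fetch_progress(request_id: str) -> dict:
    """GET /api/progress/{request_id} and return the decoded JSON body."""
    url = f"{BASE_URL}/api/progress/{request_id}"
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def wait_for_completion(
    request_id: str,
    fetch: Callable[[str], dict] = fetch_progress,
    poll_seconds: float = 2.0,
    max_polls: int = 300,
) -> dict:
    """Poll until progress reaches 1.0 (assumed to mean 'finished')."""
    for _ in range(max_polls):
        status = fetch(request_id)
        print(f"[{status['stage']}] {status['progress']:.0%} {status['message']}")
        if status["progress"] >= 1.0:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"{request_id} did not finish after {max_polls} polls")
```

Passing `fetch` as a parameter keeps the polling loop testable without network access. Keep `poll_seconds` high enough that polling stays under the rate limit.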

	## 🔧 Configuration

	### Environment Variables

	| Variable                  | Description                    | Default  |
	| ------------------------- | ------------------------------ | -------- |
	| `GEMINI_API_KEY`          | Google Gemini API key          | Required |
	| `OPENROUTER_API_KEY`      | OpenRouter API key (DeepSeek)  | Required |
	| `TAVILY_API_KEY`          | Tavily Search API key          | Required |
	| `SERPAPI_API_KEY`         | SerpAPI key for Google Scholar | Optional |
	| `MAX_REQUESTS_PER_MINUTE` | Rate limit                     | 10       |
	| `MAX_CONCURRENT_TASKS`    | Max parallel executions        | 3        |
	| `MAX_RETRIES`             | Retry attempts on failure      | 5        |
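The table above could be read into a typed settings object at startup. This is a minimal sketch, not the contents of the project's `config.py`, which this README does not show. The `Settings` class and `load_settings` function are hypothetical names; the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED_KEYS = ("GEMINI_API_KEY", "OPENROUTER_API_KEY", "TAVILY_API_KEY")


@dataclass(frozen=True)
class Settings:
    gemini_api_key: str
    openrouter_api_key: str
    tavily_api_key: str
    serpapi_api_key: Optional[str]  # optional: only needed for Google Scholar
    max_requests_per_minute: int = 10
    max_concurrent_tasks: int = 3
    max_retries: int = 5


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from the environment, failing fast on missing required keys."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    return Settings(
        gemini_api_key=env["GEMINI_API_KEY"],
        openrouter_api_key=env["OPENROUTER_API_KEY"],
        tavily_api_key=env["TAVILY_API_KEY"],
        serpapi_api_key=env.get("SERPAPI_API_KEY") or None,
        max_requests_per_minute=int(env.get("MAX_REQUESTS_PER_MINUTE", 10)),
        max_concurrent_tasks=int(env.get("MAX_CONCURRENT_TASKS", 3)),
        max_retries=int(env.get("MAX_RETRIES", 5)),
    )
```

Failing at startup when a required key is missing gives a clearer error than a failed upstream API call in the middle of a pipeline run.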

	### Rate Limits

	- 10 requests per minute per client IP
	- Maximum 3 concurrent pipeline executions
	- Queue size: 20 pending requests
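The per-IP limit could be enforced in several ways, and this README does not show `utils/rate_limiter.py`. The sketch below assumes a sliding-window strategy, where a request is allowed if fewer than 10 requests from the same IP fall within the preceding 60 seconds. `SlidingWindowRateLimiter` is a hypothetical class name.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowRateLimiter:
    """Allow at most `max_requests` per client within any `window_seconds` span."""

    def __init__(self, max_requests: int = 10, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client IP -> timestamps of accepted requests

    def allow(self, client_ip: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[client_ip]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit; rejected requests are not recorded
        hits.append(now)
        return True
```

The optional `now` argument exists only to make the limiter deterministic in tests; production callers would omit it.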

	## 🏗️ Architecture

	```
	api/
	├── app.py                          # Main Gradio application
	├── config.py                       # Configuration management
	├── requirements.txt                # Python dependencies
	├── pipeline/                       # Pipeline modules
	│   ├── critique_extraction.py      # Gemini-based extraction
	│   ├── disagreement_detection.py
	│   ├── search_retrieval.py         # LangChain search agent
	│   ├── disagreement_resolution.py  # DeepSeek resolution
	│   └── meta_review.py
	└── utils/                          # Utility modules
	    ├── rate_limiter.py
	    ├── queue_manager.py
	    └── validators.py
	```
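The tree lists `utils/queue_manager.py` without showing its contents. As an illustration only, the sketch below shows one way the limits from the Rate Limits section (3 concurrent runs, 20 pending requests) could be applied as admission control. `PipelineQueue` and its method names are hypothetical, and the real module may work differently.

```python
from collections import deque
from typing import Optional


class PipelineQueue:
    """Admit up to `max_running` requests, queue up to `max_pending`, reject the rest."""

    def __init__(self, max_running: int = 3, max_pending: int = 20):
        self.max_running = max_running
        self.max_pending = max_pending
        self._running = set()
        self._pending = deque()

    def submit(self, request_id: str) -> str:
        """Return 'running', 'queued', or 'rejected' for a new request."""
        if len(self._running) < self.max_running:
            self._running.add(request_id)
            return "running"
        if len(self._pending) < self.max_pending:
            self._pending.append(request_id)
            return "queued"
        return "rejected"

    def finish(self, request_id: str) -> Optional[str]:
        """Mark a running request as done and promote the oldest queued one, if any."""
        self._running.remove(request_id)  # raises KeyError if it was not running
        if self._pending:
            promoted = self._pending.popleft()
            self._running.add(promoted)
            return promoted
        return None
```

A request that comes back as `rejected` would typically map to an HTTP 429 or 503 response, though this README does not specify the status code.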

	## 🔍 Pipeline Stages

	1. Critique Extraction (Gemini 2.0)

	- Extracts structured critique points
	- Categories: Methodology, Experiments, Clarity, Significance, Novelty

	2. Disagreement Detection (Gemini 2.0)

	- Compares all review pairs
	- Assigns disagreement scores (0-1)
	- Identifies specific conflict points
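The pairwise comparison can be sketched with `itertools.combinations`. In the pipeline the score comes from Gemini; the `category_gap_score` function below is only a toy stand-in (the Jaccard distance between the sets of categories each reviewer commented on), used so the example runs offline. The function names and output field names are hypothetical.

```python
from itertools import combinations
from typing import Callable


def category_gap_score(a: dict, b: dict) -> float:
    """Toy stand-in for the LLM score: Jaccard distance over non-empty categories."""
    cats_a = {name for name, points in a.items() if points}
    cats_b = {name for name, points in b.items() if points}
    union = cats_a | cats_b
    if not union:
        return 0.0  # neither review raised any point
    return 1.0 - len(cats_a & cats_b) / len(union)


def compare_review_pairs(
    critiques: list[dict],
    score: Callable[[dict, dict], float] = category_gap_score,
) -> list[dict]:
    """Score every unordered pair of reviews; n reviews yield n*(n-1)/2 pairs."""
    return [
        {"review_pair": (i, j), "disagreement_score": score(a, b)}
        for (i, a), (j, b) in combinations(enumerate(critiques), 2)
    ]
```

Each entry in `critiques` follows the per-review shape used by `/api/disagreement_detection`, a dictionary mapping a category name to a list of critique points.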

	3. Search & Retrieval (LangChain + Multi-Search)

	- Discovers state-of-the-art (SoTA) research
	- Evidence validation
	- Sources: Semantic Scholar, arXiv, Google Scholar, Tavily

	4. Disagreement Resolution (DeepSeek-R1)

	- Validates critique points
	- Accepts/rejects based on evidence
	- Provides resolution summaries

	5. Meta-Review Generation (DeepSeek-R1)

	- Synthesizes all analyses
	- Provides final verdict
	- Offers actionable recommendations
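The five stages run in sequence, each building on the output of the previous ones. The sketch below shows that orchestration pattern with a progress callback. The stage keys mirror the module names under `pipeline/`; `run_pipeline`, the stage-function signature, and the evenly spaced progress fractions are assumptions, not the project's actual implementation.

```python
from typing import Any, Callable

STAGES = [
    "critique_extraction",
    "disagreement_detection",
    "search_retrieval",
    "disagreement_resolution",
    "meta_review",
]


def run_pipeline(
    payload: dict,
    stage_fns: dict[str, Callable[[dict], Any]],
    on_progress: Callable[[str, float], None] = lambda stage, fraction: None,
) -> dict:
    """Run each stage in order, storing its output in the shared state under its key."""
    state = dict(payload)  # copy so the caller's payload is not mutated
    for index, stage in enumerate(STAGES):
        on_progress(stage, index / len(STAGES))  # fraction completed before this stage
        state[stage] = stage_fns[stage](state)  # later stages can read earlier outputs
    on_progress("done", 1.0)
    return state
```

Each stage function receives the accumulated state, so the resolution stage, for example, can read both the detected disagreements and the search results.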

	## 📊 Example Usage

	### Python

	```python
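	import requests

	# Replace "your-space" with your own Space subdomain.
	response = requests.post(
	    "https://your-space.hf.space/api/full_pipeline",
	    json={
	        "paper_title": "Novel Approach to X",
	        "paper_abstract": "We propose...",
	        "reviews": [
	            "Reviewer 1: Strong methodology...",
	            "Reviewer 2: Weak experimental validation...",
	        ],
	    },
	    timeout=600,  # the full pipeline is long-running; adjust as needed
	)
	response.raise_for_status()  # fail loudly on HTTP errors

	result = response.json()
	print(result["meta_review"])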
	```

	### cURL

	```bash
	curl -X POST https://your-space.hf.space/api/full_pipeline \
	  -H "Content-Type: application/json" \
	  -d '{
	    "paper_title": "Novel Approach to X",
	    "paper_abstract": "We propose...",
	    "reviews": ["Review 1...", "Review 2..."]
	  }'
	```

	## 📝 License

	MIT, as declared in the Space metadata. See the main project LICENSE file.