---
title: MetaSearch API
emoji: 🔬
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: "5.9.1"
app_file: app.py
pinned: false
license: mit
---
# 🔬 Automated Consensus Analysis API
A HuggingFace Spaces API for automated consensus analysis of peer reviews, using LLMs and search-augmented verification.
## 🌟 Features
- **Critique Extraction**: Extract structured critique points from peer reviews using Gemini 2.0
- **Disagreement Detection**: Identify conflicts and disagreements between reviewers
- **Search-Augmented Verification**: Retrieve supporting/contradicting evidence from academic sources
- **Disagreement Resolution**: AI-powered resolution using DeepSeek-R1 with reasoning
- **Meta-Review Generation**: Comprehensive meta-reviews synthesizing all analyses
- **Rate Limiting**: 10 requests per minute per client
- **Queue Management**: Up to 3 concurrent pipeline executions
- **Progress Tracking**: Real-time status updates for long-running tasks
## 🚀 Quick Start
### Local Development
1. **Clone and set up**
```bash
cd api
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```
2. **Configure environment**
```bash
cp .env.example .env
# Edit .env with your API keys
```
3. **Run the application**
```bash
python app.py
```
Visit `http://localhost:7860` to access the Gradio interface.
### HuggingFace Spaces Deployment
1. **Create a new Space**
   - Go to [HuggingFace Spaces](https://huggingface.co/spaces)
   - Click "Create new Space"
   - Select "Gradio" as SDK
2. **Upload files**
   - Upload all files from the `api/` directory
   - Ensure `requirements.txt` and `app.py` are in the root
3. **Configure secrets**
   - Go to Space Settings → Repository secrets
   - Add the following secrets:
     - `GEMINI_API_KEY`
     - `OPENROUTER_API_KEY`
     - `TAVILY_API_KEY`
     - `SERPAPI_API_KEY`
4. **Deploy**
   - The Space will automatically build and deploy
## 📚 API Endpoints
### Full Pipeline
**Endpoint**: `/api/full_pipeline`
**Method**: POST
**Description**: Run the complete consensus analysis pipeline
**Request Body**:
```json
{
  "paper_title": "Visual Correspondence Hallucination",
  "paper_abstract": "This paper investigates...",
  "reviews": [
    "Review 1: The methodology is sound but...",
    "Review 2: While the experiments are comprehensive..."
  ]
}
```
**Response**:
```json
{
  "request_id": "req_123456789",
  "paper_title": "...",
  "critique_points": [...],
  "disagreements": [...],
  "search_results": {...},
  "resolution": [...],
  "meta_review": "..."
}
```
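Before sending a request, it can help to check the payload against the shape shown above. The following is a minimal client-side sketch, not the validation the API itself performs: `build_pipeline_payload` is a hypothetical helper, and the requirement of at least two non-empty reviews is an assumption based on the pipeline comparing reviewers against each other.

```python
from typing import Dict, List, Union

Payload = Dict[str, Union[str, List[str]]]


def build_pipeline_payload(title: str, abstract: str, reviews: List[str]) -> Payload:
    """Build a /api/full_pipeline request body (hypothetical client-side helper)."""
    if not title.strip():
        raise ValueError("paper_title must not be empty")
    if not abstract.strip():
        raise ValueError("paper_abstract must not be empty")
    # Assumption: disagreement detection needs at least two reviews to compare.
    cleaned = [review.strip() for review in reviews if review.strip()]
    if len(cleaned) < 2:
        raise ValueError("at least two non-empty reviews are required")
    return {
        "paper_title": title.strip(),
        "paper_abstract": abstract.strip(),
        "reviews": cleaned,
    }
```

The returned dictionary can be passed directly as the `json=` argument of an HTTP client call.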
### Individual Stages
#### Critique Extraction
**Endpoint**: `/api/critique_extraction`
**Method**: POST
```json
{
  "reviews": ["Review 1 text...", "Review 2 text..."]
}
```
#### Disagreement Detection
**Endpoint**: `/api/disagreement_detection`
**Method**: POST
```json
{
  "critiques": [
    {"Methodology": [...], "Experiments": [...]},
    {"Methodology": [...], "Experiments": [...]}
  ]
}
```
#### Search & Retrieval
**Endpoint**: `/api/search_retrieval`
**Method**: POST
```json
{
  "paper_title": "...",
  "paper_abstract": "...",
  "critiques": [...]
}
```
#### Progress Tracking
**Endpoint**: `/api/progress/{request_id}`
**Method**: GET
**Response**:
```json
{
  "stage": "search_retrieval",
  "progress": 0.5,
  "message": "Searching for relevant research...",
  "timestamp": "2025-01-15T10:30:00"
}
```
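A client can poll this endpoint while a pipeline run is in flight. The sketch below uses only the standard library and rests on assumptions: `wait_until_done` and `fetch_progress` are hypothetical names, the host is the placeholder used elsewhere in this README, and a `progress` value of 1.0 is assumed to signal completion, which the API may signal differently.

```python
import json
import time
import urllib.request
from typing import Callable, Dict

BASE_URL = "https://your-space.hf.space"  # placeholder host, replace with your Space


def fetch_progress(request_id: str) -> Dict:
    """GET /api/progress/{request_id} and decode the JSON body."""
    url = f"{BASE_URL}/api/progress/{request_id}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def wait_until_done(
    request_id: str,
    fetch: Callable[[str], Dict] = fetch_progress,
    interval: float = 2.0,
    max_polls: int = 300,
) -> Dict:
    """Poll until progress reaches 1.0 (assumed completion signal)."""
    for _ in range(max_polls):
        status = fetch(request_id)
        print(f"[{status['stage']}] {status['progress']:.0%} {status['message']}")
        if status["progress"] >= 1.0:
            return status
        time.sleep(interval)
    raise TimeoutError(f"{request_id} did not finish after {max_polls} polls")
```

Passing `fetch` as a parameter keeps the loop testable without network access; in normal use, `wait_until_done("req_123456789")` is enough.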
## 🔧 Configuration
### Environment Variables
| Variable                  | Description                                 | Required / Default |
| ------------------------- | ------------------------------------------- | ------------------ |
| `GEMINI_API_KEY`          | Google Gemini API key                       | Required           |
| `OPENROUTER_API_KEY`      | OpenRouter API key (DeepSeek)               | Required           |
| `TAVILY_API_KEY`          | Tavily Search API key                       | Required           |
| `SERPAPI_API_KEY`         | SerpAPI key for Google Scholar              | Optional           |
| `MAX_REQUESTS_PER_MINUTE` | Rate limit (requests per minute per client) | 10                 |
| `MAX_CONCURRENT_TASKS`    | Maximum concurrent pipeline executions      | 3                  |
| `MAX_RETRIES`             | Retry attempts on failure                   | 5                  |
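One way to read these variables is sketched below. It is an illustration, not the contents of `config.py`: the `Settings` class and `load_settings` function are hypothetical names, and only the variable names and defaults are taken from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED_KEYS = ("GEMINI_API_KEY", "OPENROUTER_API_KEY", "TAVILY_API_KEY")


@dataclass(frozen=True)
class Settings:
    gemini_api_key: str
    openrouter_api_key: str
    tavily_api_key: str
    serpapi_api_key: Optional[str]
    max_requests_per_minute: int
    max_concurrent_tasks: int
    max_retries: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from the environment, applying the documented defaults."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing required variables: " + ", ".join(missing))
    return Settings(
        gemini_api_key=env["GEMINI_API_KEY"],
        openrouter_api_key=env["OPENROUTER_API_KEY"],
        tavily_api_key=env["TAVILY_API_KEY"],
        # Optional key: an unset or empty value becomes None.
        serpapi_api_key=env.get("SERPAPI_API_KEY") or None,
        max_requests_per_minute=int(env.get("MAX_REQUESTS_PER_MINUTE", "10")),
        max_concurrent_tasks=int(env.get("MAX_CONCURRENT_TASKS", "3")),
        max_retries=int(env.get("MAX_RETRIES", "5")),
    )
```

Failing at startup when a required key is missing tends to be easier to diagnose than a failed model call in the middle of a pipeline run.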
### Rate Limits
- **10 requests per minute** per client IP
- **Maximum 3 concurrent** pipeline executions
- **Queue size**: 20 pending requests
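The per-client limit can be pictured as a sliding window over recent request timestamps. The class below is a minimal sketch of that idea under stated assumptions: `SlidingWindowLimiter` is a hypothetical name, the actual algorithm in `utils/rate_limiter.py` may differ, and the concurrency and queue limits are not modeled here.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each client."""

    def __init__(
        self,
        limit: int = 10,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_ip: str) -> bool:
        """Record the request and return True if the client is under its limit."""
        now = self.clock()
        hits = self._hits[client_ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

With the defaults, the eleventh call to `allow()` from the same IP within 60 seconds returns `False`, while other clients are unaffected.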
## 🏗️ Architecture
```
api/
├── app.py                          # Main Gradio application
├── config.py                       # Configuration management
├── requirements.txt                # Python dependencies
├── pipeline/                       # Pipeline modules
│   ├── critique_extraction.py      # Gemini-based extraction
│   ├── disagreement_detection.py
│   ├── search_retrieval.py         # LangChain search agent
│   ├── disagreement_resolution.py  # DeepSeek resolution
│   └── meta_review.py
└── utils/                          # Utility modules
    ├── rate_limiter.py
    ├── queue_manager.py
    └── validators.py
```
## 🔍 Pipeline Stages
1. **Critique Extraction** (Gemini 2.0)
   - Extracts structured critique points
   - Categories: Methodology, Experiments, Clarity, Significance, Novelty
2. **Disagreement Detection** (Gemini 2.0)
   - Compares all review pairs
   - Assigns disagreement scores (0-1)
   - Identifies specific conflict points
3. **Search & Retrieval** (LangChain + Multi-Search)
   - Discovers state-of-the-art (SoTA) research
   - Validates evidence
   - Sources: Semantic Scholar, arXiv, Google Scholar, Tavily
4. **Disagreement Resolution** (DeepSeek-R1)
   - Validates critique points
   - Accepts/rejects based on evidence
   - Provides resolution summaries
5. **Meta-Review Generation** (DeepSeek-R1)
   - Synthesizes all analyses
   - Provides final verdict
   - Offers actionable recommendations
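The pairwise comparison in stage 2 can be illustrated with a short sketch. The enumeration of review pairs follows the description above, but the scoring function is a loud stand-in: `toy_score` is a hypothetical word-overlap measure used only so the example runs, whereas the pipeline uses an LLM to score disagreement. `score_all_pairs` is likewise a hypothetical name.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple


def toy_score(review_a: str, review_b: str) -> float:
    """Placeholder score in [0, 1]: 1 minus the Jaccard overlap of word sets.

    Stand-in only; the pipeline described above uses an LLM for this step.
    """
    words_a = set(review_a.lower().split())
    words_b = set(review_b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return 1.0 - len(words_a & words_b) / len(words_a | words_b)


def score_all_pairs(
    reviews: List[str],
    score: Callable[[str, str], float] = toy_score,
) -> Dict[Tuple[int, int], float]:
    """Score every unordered pair of reviews, keyed by (i, j) with i < j."""
    return {
        (i, j): score(reviews[i], reviews[j])
        for i, j in combinations(range(len(reviews)), 2)
    }
```

Because every unordered pair is compared, n reviews produce n(n-1)/2 comparisons, so four reviews require six scoring calls.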
## 📊 Example Usage
### Python
```python
import requests

response = requests.post(
    "https://your-space.hf.space/api/full_pipeline",
    json={
        "paper_title": "Novel Approach to X",
        "paper_abstract": "We propose...",
        "reviews": [
            "Reviewer 1: Strong methodology...",
            "Reviewer 2: Weak experimental validation...",
        ],
    },
    timeout=600,  # the full pipeline is long-running; adjust as needed
)
response.raise_for_status()  # fail loudly on HTTP errors

result = response.json()
print(result["meta_review"])
```
### cURL
```bash
curl -X POST https://your-space.hf.space/api/full_pipeline \
  -H "Content-Type: application/json" \
  -d '{
    "paper_title": "Novel Approach to X",
    "paper_abstract": "We propose...",
    "reviews": ["Review 1...", "Review 2..."]
  }'
```
## 📝 License
See the main project LICENSE file.