---
title: MetaSearch API
emoji: 🔬
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: "5.9.1"
app_file: app.py
pinned: false
license: mit
---

# 🔬 Automated Consensus Analysis API

A HuggingFace Spaces API that automates consensus analysis of peer reviews, combining LLMs with search-augmented verification.

## 🌟 Features

- **Critique Extraction**: Extract structured critique points from peer reviews using Gemini 2.0
- **Disagreement Detection**: Identify conflicts and disagreements between reviewers
- **Search-Augmented Verification**: Retrieve supporting/contradicting evidence from academic sources
- **Disagreement Resolution**: AI-powered resolution using DeepSeek-R1 with reasoning
- **Meta-Review Generation**: Comprehensive meta-reviews synthesizing all analyses
- **Rate Limiting**: 10 requests per minute per client
- **Queue Management**: Up to 3 concurrent pipeline executions
- **Progress Tracking**: Real-time status updates for long-running tasks

## 🚀 Quick Start

### Local Development

1. **Clone and set up**

```bash
cd api
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```

2. **Configure environment**

```bash
cp .env.example .env
# Edit .env with your API keys
```

3. **Run the application**

```bash
python app.py
```

Visit `http://localhost:7860` to access the Gradio interface.

### HuggingFace Spaces Deployment

1. **Create a new Space**

   - Go to [HuggingFace Spaces](https://huggingface.co/spaces)
   - Click "Create new Space"
   - Select "Gradio" as SDK

2. **Upload files**

   - Upload all files from the `api/` directory
   - Ensure `requirements.txt` and `app.py` are in the root

3. **Configure secrets**

   - Go to Space Settings → Repository secrets
   - Add the following secrets:
     - `GEMINI_API_KEY`
     - `OPENROUTER_API_KEY`
     - `TAVILY_API_KEY`
     - `SERPAPI_API_KEY` (optional)

4. **Deploy**
   - The Space will automatically build and deploy

## 📚 API Endpoints

### Full Pipeline

**Endpoint**: `/api/full_pipeline`  
**Method**: POST  
**Description**: Run the complete consensus analysis pipeline

**Request Body**:

```json
{
  "paper_title": "Visual Correspondence Hallucination",
  "paper_abstract": "This paper investigates...",
  "reviews": [
    "Review 1: The methodology is sound but...",
    "Review 2: While the experiments are comprehensive..."
  ]
}
```

**Response**:

```json
{
  "request_id": "req_123456789",
  "paper_title": "...",
  "critique_points": [...],
  "disagreements": [...],
  "search_results": {...},
  "resolution": [...],
  "meta_review": "..."
}
```
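
Before sending a request, it can help to check the body on the client side. The following is a minimal sketch only: the field names come from the request example above, but the function name and the specific rules (non-empty strings, at least two reviews) are assumptions, not the checks performed by the API's own `utils/validators.py`.

```python
from typing import Any


def validate_pipeline_request(payload: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a full-pipeline request body.

    Hypothetical helper: the rules below are assumptions for illustration.
    """
    errors: list[str] = []

    # Title and abstract are shown as plain strings in the request example.
    for field in ("paper_title", "paper_abstract"):
        value = payload.get(field)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"{field} must be a non-empty string")

    reviews = payload.get("reviews")
    if not isinstance(reviews, list):
        errors.append("reviews must be a list of strings")
    else:
        if len(reviews) < 2:
            # Assumption: consensus analysis needs at least two reviews.
            errors.append("reviews must contain at least two entries")
        for index, review in enumerate(reviews):
            if not isinstance(review, str) or not review.strip():
                errors.append(f"reviews[{index}] must be a non-empty string")

    return errors
```

An empty list means the payload passed these illustrative checks; the server may still apply stricter or different rules.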

### Individual Stages

#### Critique Extraction

**Endpoint**: `/api/critique_extraction`  
**Method**: POST

```json
{
  "reviews": ["Review 1 text...", "Review 2 text..."]
}
```

#### Disagreement Detection

**Endpoint**: `/api/disagreement_detection`  
**Method**: POST

```json
{
  "critiques": [
    {"Methodology": [...], "Experiments": [...]},
    {"Methodology": [...], "Experiments": [...]}
  ]
}
```

#### Search & Retrieval

**Endpoint**: `/api/search_retrieval`  
**Method**: POST

```json
{
  "paper_title": "...",
  "paper_abstract": "...",
  "critiques": [...]
}
```

#### Progress Tracking

**Endpoint**: `/api/progress/{request_id}`  
**Method**: GET

**Response**:

```json
{
  "stage": "search_retrieval",
  "progress": 0.5,
  "message": "Searching for relevant research...",
  "timestamp": "2025-01-15T10:30:00"
}
```
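
A client can poll this endpoint until the pipeline finishes. The sketch below assumes that a `progress` value of `1.0` signals completion, which this README does not state explicitly; the function name, polling interval, and poll budget are likewise illustrative. The HTTP call is passed in as a callable so the loop itself stays independent of any particular client library.

```python
import time
from typing import Any, Callable


def wait_for_completion(
    fetch_progress: Callable[[], dict[str, Any]],
    interval_seconds: float = 2.0,
    max_polls: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll until `progress` reaches 1.0 or the poll budget is used up.

    Assumption: progress == 1.0 means the pipeline has finished.
    """
    for _ in range(max_polls):
        status = fetch_progress()
        print(
            f"[{status.get('stage')}] "
            f"{status.get('progress', 0):.0%} {status.get('message', '')}"
        )
        if status.get("progress", 0) >= 1.0:
            return status
        sleep(interval_seconds)
    raise TimeoutError(f"pipeline did not finish after {max_polls} polls")
```

With `requests`, `fetch_progress` could be `lambda: requests.get(f"{base_url}/api/progress/{request_id}", timeout=10).json()`, where `base_url` and `request_id` are your Space URL and the `request_id` returned by the pipeline.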

## 🔧 Configuration

### Environment Variables

| Variable                  | Description                     | Required / Default |
| ------------------------- | ------------------------------- | ------------------ |
| `GEMINI_API_KEY`          | Google Gemini API key           | Required           |
| `OPENROUTER_API_KEY`      | OpenRouter API key (DeepSeek)   | Required           |
| `TAVILY_API_KEY`          | Tavily Search API key           | Required           |
| `SERPAPI_API_KEY`         | SerpAPI key for Google Scholar  | Optional           |
| `MAX_REQUESTS_PER_MINUTE` | Requests per minute per client  | 10                 |
| `MAX_CONCURRENT_TASKS`    | Max parallel executions         | 3                  |
| `MAX_RETRIES`             | Retry attempts on failure       | 5                  |
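
The table above can be read into a typed settings object at startup. This is a sketch of one way to do it, not the contents of the project's `config.py`: the `Settings` class and `load_settings` function are hypothetical names, while the variable names and defaults are taken from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED_KEYS = ("GEMINI_API_KEY", "OPENROUTER_API_KEY", "TAVILY_API_KEY")


@dataclass(frozen=True)
class Settings:
    gemini_api_key: str
    openrouter_api_key: str
    tavily_api_key: str
    serpapi_api_key: Optional[str]
    max_requests_per_minute: int
    max_concurrent_tasks: int
    max_retries: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast on missing keys."""
    missing = [name for name in REQUIRED_KEYS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "missing required environment variables: " + ", ".join(missing)
        )
    return Settings(
        gemini_api_key=env["GEMINI_API_KEY"],
        openrouter_api_key=env["OPENROUTER_API_KEY"],
        tavily_api_key=env["TAVILY_API_KEY"],
        # Optional key: treat an unset or empty value as "not configured".
        serpapi_api_key=env.get("SERPAPI_API_KEY") or None,
        # Defaults mirror the table above.
        max_requests_per_minute=int(env.get("MAX_REQUESTS_PER_MINUTE", "10")),
        max_concurrent_tasks=int(env.get("MAX_CONCURRENT_TASKS", "3")),
        max_retries=int(env.get("MAX_RETRIES", "5")),
    )
```

Accepting the environment as a mapping makes the loader easy to exercise with a plain dictionary in tests.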

### Rate Limits

- **10 requests per minute** per client IP
- **Maximum 3 concurrent** pipeline executions
- **Queue size**: 20 pending requests
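
To illustrate the per-client limit, here is a minimal sliding-window limiter. It is a sketch under assumptions: this README does not say which algorithm `utils/rate_limiter.py` uses, and the class name, method name, and injectable clock are hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowRateLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(
        self,
        max_requests: int = 10,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        # One deque of request timestamps per client identifier (e.g. an IP).
        self._hits = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A request handler would call `limiter.allow(client_ip)` and reject the request when it returns `False`.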

## 🏗️ Architecture

```
api/
├── app.py                      # Main Gradio application
├── config.py                   # Configuration management
├── requirements.txt            # Python dependencies
├── pipeline/                   # Pipeline modules
│   ├── critique_extraction.py  # Gemini-based extraction
│   ├── disagreement_detection.py
│   ├── search_retrieval.py     # LangChain search agent
│   ├── disagreement_resolution.py  # DeepSeek resolution
│   └── meta_review.py
└── utils/                      # Utility modules
    ├── rate_limiter.py
    ├── queue_manager.py
    └── validators.py
```
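
The contents of `utils/queue_manager.py` are not reproduced in this README. As a rough sketch of how the documented limits (3 concurrent executions, 20 pending requests) could be enforced with `asyncio`, consider the following; the class, exception, and method names are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class QueueFullError(RuntimeError):
    """Raised when too many requests are already waiting for a slot."""


class PipelineQueue:
    def __init__(self, max_concurrent: int = 3, max_pending: int = 20) -> None:
        self._slots = asyncio.Semaphore(max_concurrent)
        self._max_pending = max_pending
        self._pending = 0

    async def run(self, job: Callable[[], Awaitable[T]]) -> T:
        """Run `job` once a slot is free, or reject it if the queue is full."""
        if self._pending >= self._max_pending:
            raise QueueFullError("too many pending requests")
        self._pending += 1
        try:
            # Wait here until one of the concurrent slots is available.
            await self._slots.acquire()
        finally:
            self._pending -= 1
        try:
            return await job()
        finally:
            self._slots.release()
```

Create the queue inside the running event loop (for example at application startup) so the semaphore is bound to the loop that serves requests.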

## 🔍 Pipeline Stages

1. **Critique Extraction** (Gemini 2.0)

   - Extracts structured critique points
   - Categories: Methodology, Experiments, Clarity, Significance, Novelty

2. **Disagreement Detection** (Gemini 2.0)

   - Compares all review pairs
   - Assigns disagreement scores (0-1)
   - Identifies specific conflict points

3. **Search & Retrieval** (LangChain + Multi-Search)

   - Discovery of state-of-the-art (SoTA) research
   - Evidence validation
   - Sources: Semantic Scholar, arXiv, Google Scholar, Tavily

4. **Disagreement Resolution** (DeepSeek-R1)

   - Validates critique points
   - Accepts/rejects based on evidence
   - Provides resolution summaries

5. **Meta-Review Generation** (DeepSeek-R1)
   - Synthesizes all analyses
   - Provides final verdict
   - Offers actionable recommendations
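
Stage 2 compares every pair of reviews and assigns each pair a score between 0 and 1. The pair enumeration can be sketched as below. The scoring function here is a toy stand-in chosen so the example runs offline; in the actual pipeline the score comes from Gemini, and the function names and output shape are assumptions.

```python
from itertools import combinations
from typing import Callable

Critique = dict[str, list[str]]


def score_review_pairs(
    critiques: list[Critique],
    score_pair: Callable[[Critique, Critique], float],
) -> list[dict]:
    """Score every unordered pair of reviews exactly once."""
    results = []
    for (i, first), (j, second) in combinations(enumerate(critiques), 2):
        score = score_pair(first, second)
        # Clamp to the documented 0-1 range.
        results.append({"reviewers": (i, j), "score": min(1.0, max(0.0, score))})
    return results


def category_mismatch(first: Critique, second: Critique) -> float:
    """Toy scorer: share of categories where only one reviewer raised points."""
    categories = set(first) | set(second)
    if not categories:
        return 0.0
    differing = sum(
        1 for name in categories if bool(first.get(name)) != bool(second.get(name))
    )
    return differing / len(categories)
```

For `n` reviews this produces `n * (n - 1) / 2` pair scores, so three reviews yield three comparisons.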

## 📊 Example Usage

### Python

```python
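import requests

# Replace the host with your own Space URL.
response = requests.post(
    "https://your-space.hf.space/api/full_pipeline",
    json={
        "paper_title": "Novel Approach to X",
        "paper_abstract": "We propose...",
        "reviews": [
            "Reviewer 1: Strong methodology...",
            "Reviewer 2: Weak experimental validation..."
        ]
    },
    timeout=600,  # the pipeline is long-running; this value is a guess, adjust as needed
)
response.raise_for_status()  # fail early on HTTP error responses

result = response.json()
print(result["meta_review"])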
```

### cURL

```bash
curl -X POST https://your-space.hf.space/api/full_pipeline \
  -H "Content-Type: application/json" \
  -d '{
    "paper_title": "Novel Approach to X",
    "paper_abstract": "We propose...",
    "reviews": ["Review 1...", "Review 2..."]
  }'
```

## 📝 License

MIT, as declared in the Space metadata. See the main project LICENSE file for the full text.