---
title: MetaSearch API
emoji: 🔬
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.9.1
app_file: app.py
pinned: false
license: mit
---

# 🔬 Automated Consensus Analysis API

A comprehensive HuggingFace Spaces API for automated peer review consensus analysis using LLMs and search-augmented verification.

🌟 Features

  • Critique Extraction: Extract structured critique points from peer reviews using Gemini 2.0
  • Disagreement Detection: Identify conflicts and disagreements between reviewers
  • Search-Augmented Verification: Retrieve supporting/contradicting evidence from academic sources
  • Disagreement Resolution: AI-powered resolution using DeepSeek-R1 with reasoning
  • Meta-Review Generation: Comprehensive meta-reviews synthesizing all analyses
  • Rate Limiting: 10 requests per minute per client
  • Queue Management: Up to 3 concurrent pipeline executions
  • Progress Tracking: Real-time status updates for long-running tasks

## 🚀 Quick Start

### Local Development

1. **Clone and set up**

   ```bash
   cd api
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   pip install -r requirements.txt
   ```

2. **Configure environment**

   ```bash
   cp .env.example .env
   # Edit .env with your API keys
   ```

3. **Run the application**

   ```bash
   python app.py
   ```

Visit http://localhost:7860 to access the Gradio interface.

### HuggingFace Spaces Deployment

1. **Create a new Space**

2. **Upload files**

   - Upload all files from the `api/` directory
   - Ensure `requirements.txt` and `app.py` are in the root

3. **Configure secrets**

   - Go to Space Settings → Repository secrets
   - Add the following secrets:
     - `GEMINI_API_KEY`
     - `OPENROUTER_API_KEY`
     - `TAVILY_API_KEY`
     - `SERPAPI_API_KEY`

4. **Deploy**

   - The Space will automatically build and deploy

## 📚 API Endpoints

### Full Pipeline

**Endpoint:** `/api/full_pipeline`  
**Method:** `POST`  
**Description:** Run the complete consensus analysis pipeline

**Request Body:**

```json
{
  "paper_title": "Visual Correspondence Hallucination",
  "paper_abstract": "This paper investigates...",
  "reviews": [
    "Review 1: The methodology is sound but...",
    "Review 2: While the experiments are comprehensive..."
  ]
}
```

**Response:**

```json
{
  "request_id": "req_123456789",
  "paper_title": "...",
  "critique_points": [...],
  "disagreements": [...],
  "search_results": {...},
  "resolution": [...],
  "meta_review": "..."
}
```

### Individual Stages

#### Critique Extraction

**Endpoint:** `/api/critique_extraction`  
**Method:** `POST`

```json
{
  "reviews": ["Review 1 text...", "Review 2 text..."]
}
```

#### Disagreement Detection

**Endpoint:** `/api/disagreement_detection`  
**Method:** `POST`

```json
{
  "critiques": [
    {"Methodology": [...], "Experiments": [...]},
    {"Methodology": [...], "Experiments": [...]}
  ]
}
```

#### Search & Retrieval

**Endpoint:** `/api/search_retrieval`  
**Method:** `POST`

```json
{
  "paper_title": "...",
  "paper_abstract": "...",
  "critiques": [...]
}
```

### Progress Tracking

**Endpoint:** `/api/progress/{request_id}`  
**Method:** `GET`

**Response:**

```json
{
  "stage": "search_retrieval",
  "progress": 0.5,
  "message": "Searching for relevant research...",
  "timestamp": "2025-01-15T10:30:00"
}
```
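A client can poll this endpoint until the pipeline finishes. Below is a minimal polling sketch; the fetch function is injected as a callable so any HTTP client can be used, and the completion criterion (`progress` reaching `1.0`) is an assumption about the final payload:

```python
import time


def wait_for_completion(fetch_progress, poll_interval=2.0, max_polls=300):
    """Poll fetch_progress() until the reported progress reaches 1.0.

    fetch_progress: callable returning a dict shaped like the
    /api/progress response above. Returns the final payload seen.
    """
    payload = {}
    for _ in range(max_polls):
        payload = fetch_progress()
        if payload.get("progress", 0.0) >= 1.0:
            return payload
        time.sleep(poll_interval)
    raise TimeoutError(f"pipeline still at stage {payload.get('stage')!r}")
```

To wire it to the API, pass something like `fetch_progress=lambda: requests.get(f"{base_url}/api/progress/{request_id}").json()`.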

## 🔧 Configuration

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `GEMINI_API_KEY` | Google Gemini API key | Required |
| `OPENROUTER_API_KEY` | OpenRouter API key (DeepSeek) | Required |
| `TAVILY_API_KEY` | Tavily Search API key | Required |
| `SERPAPI_API_KEY` | SerpAPI key for Google Scholar | Optional |
| `MAX_REQUESTS_PER_MINUTE` | Rate limit | `10` |
| `MAX_CONCURRENT_TASKS` | Max parallel executions | `3` |
| `MAX_RETRIES` | Retry attempts on failure | `5` |
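For local development, the same variables go in `.env`; a sample with illustrative placeholder values:

```bash
# Required
GEMINI_API_KEY=your-gemini-key
OPENROUTER_API_KEY=your-openrouter-key
TAVILY_API_KEY=your-tavily-key

# Optional
SERPAPI_API_KEY=your-serpapi-key
MAX_REQUESTS_PER_MINUTE=10
MAX_CONCURRENT_TASKS=3
MAX_RETRIES=5
```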

### Rate Limits

- 10 requests per minute per client IP
- Maximum 3 concurrent pipeline executions
- Queue size: 20 pending requests
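Clients that submit many papers should pace themselves under the 10-requests-per-minute cap rather than rely on rejected requests. A minimal client-side limiter sketch (the injectable `clock`/`sleep` parameters are for testability and are not part of this API):

```python
import time
from collections import deque


class ClientRateLimiter:
    """Blocks before a call would exceed a requests-per-minute cap."""

    def __init__(self, max_per_minute=10, clock=time.monotonic, sleep=time.sleep):
        self.max_per_minute = max_per_minute
        self.clock = clock
        self.sleep = sleep
        self.sent = deque()  # timestamps of requests within the last 60 s

    def acquire(self):
        """Wait (if needed) until one more request fits in the window."""
        now = self.clock()
        # Drop timestamps that have fallen out of the 60-second window.
        while self.sent and now - self.sent[0] >= 60.0:
            self.sent.popleft()
        if len(self.sent) >= self.max_per_minute:
            # Sleep until the oldest request ages out of the window.
            self.sleep(60.0 - (now - self.sent[0]))
            now = self.clock()
            while self.sent and now - self.sent[0] >= 60.0:
                self.sent.popleft()
        self.sent.append(now)
```

Call `limiter.acquire()` immediately before each `requests.post(...)` to stay under the cap.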

πŸ—οΈ Architecture

api/
β”œβ”€β”€ app.py                      # Main Gradio application
β”œβ”€β”€ config.py                   # Configuration management
β”œβ”€β”€ requirements.txt            # Python dependencies
β”œβ”€β”€ pipeline/                   # Pipeline modules
β”‚   β”œβ”€β”€ critique_extraction.py  # Gemini-based extraction
β”‚   β”œβ”€β”€ disagreement_detection.py
β”‚   β”œβ”€β”€ search_retrieval.py     # LangChain search agent
β”‚   β”œβ”€β”€ disagreement_resolution.py  # DeepSeek resolution
β”‚   └── meta_review.py
└── utils/                      # Utility modules
    β”œβ”€β”€ rate_limiter.py
    β”œβ”€β”€ queue_manager.py
    └── validators.py

πŸ” Pipeline Stages

  1. Critique Extraction (Gemini 2.0)

    • Extracts structured critique points
    • Categories: Methodology, Experiments, Clarity, Significance, Novelty
  2. Disagreement Detection (Gemini 2.0)

    • Compares all review pairs
    • Assigns disagreement scores (0-1)
    • Identifies specific conflict points
  3. Search & Retrieval (LangChain + Multi-Search)

    • SoTA research discovery
    • Evidence validation
    • Sources: Semantic Scholar, arXiv, Google Scholar, Tavily
  4. Disagreement Resolution (DeepSeek-R1)

    • Validates critique points
    • Accepts/rejects based on evidence
    • Provides resolution summaries
  5. Meta-Review Generation (DeepSeek-R1)

    • Synthesizes all analyses
    • Provides final verdict
    • Offers actionable recommendations
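Callers using the individual stage endpoints instead of `/api/full_pipeline` chain them in the order above, each stage consuming the previous stage's output. A minimal orchestration sketch with each stage injected as a callable; the argument shapes mirror the request bodies shown earlier, but the exact signatures of the resolution and meta-review stages are assumptions:

```python
def run_stages(paper_title, paper_abstract, reviews,
               extract, detect, search, resolve, summarize):
    """Chain the five pipeline stages.

    Each parameter after `reviews` is a callable wrapping one stage
    endpoint (e.g. an HTTP POST to /api/critique_extraction).
    Returns the meta-review produced by the final stage.
    """
    critiques = extract(reviews)                              # stage 1
    disagreements = detect(critiques)                         # stage 2
    evidence = search(paper_title, paper_abstract, critiques)  # stage 3
    resolution = resolve(disagreements, evidence)             # stage 4
    return summarize(critiques, disagreements, resolution)    # stage 5
```

Keeping the stages as plain callables makes the flow easy to unit-test with stubs before pointing it at a live Space.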

## 📊 Example Usage

### Python

```python
import requests

response = requests.post(
    "https://your-space.hf.space/api/full_pipeline",
    json={
        "paper_title": "Novel Approach to X",
        "paper_abstract": "We propose...",
        "reviews": [
            "Reviewer 1: Strong methodology...",
            "Reviewer 2: Weak experimental validation..."
        ]
    }
)

result = response.json()
print(result["meta_review"])
```

### cURL

```bash
curl -X POST https://your-space.hf.space/api/full_pipeline \
  -H "Content-Type: application/json" \
  -d '{
    "paper_title": "Novel Approach to X",
    "paper_abstract": "We propose...",
    "reviews": ["Review 1...", "Review 2..."]
  }'
```

πŸ“ License

See the main project LICENSE file.