---
title: MetaSearch API
emoji: 🔬
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.9.1
app_file: app.py
pinned: false
license: mit
---

# 🔬 Automated Consensus Analysis API

A comprehensive HuggingFace Spaces API for automated peer review consensus analysis using LLMs and search-augmented verification.

🌟 Features

  • Critique Extraction: Extract structured critique points from peer reviews using Gemini 2.0
  • Disagreement Detection: Identify conflicts and disagreements between reviewers
  • Search-Augmented Verification: Retrieve supporting/contradicting evidence from academic sources
  • Disagreement Resolution: AI-powered resolution using DeepSeek-R1 with reasoning
  • Meta-Review Generation: Comprehensive meta-reviews synthesizing all analyses
  • Rate Limiting: 10 requests per minute per client
  • Queue Management: Up to 3 concurrent pipeline executions
  • Progress Tracking: Real-time status updates for long-running tasks

## 🚀 Quick Start

### Local Development

1. **Clone and set up**

   ```bash
   cd api
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   pip install -r requirements.txt
   ```

2. **Configure environment**

   ```bash
   cp .env.example .env
   # Edit .env with your API keys
   ```

3. **Run the application**

   ```bash
   python app.py
   ```

Visit http://localhost:7860 to access the Gradio interface.

### HuggingFace Spaces Deployment

1. **Create a new Space**

2. **Upload files**

   - Upload all files from the `api/` directory
   - Ensure `requirements.txt` and `app.py` are in the root

3. **Configure secrets**

   - Go to Space Settings → Repository secrets
   - Add the following secrets:
     - `GEMINI_API_KEY`
     - `OPENROUTER_API_KEY`
     - `TAVILY_API_KEY`
     - `SERPAPI_API_KEY`

4. **Deploy**

   - The Space will automatically build and deploy

## 📚 API Endpoints

### Full Pipeline

**Endpoint:** `/api/full_pipeline`  
**Method:** `POST`  
**Description:** Run the complete consensus analysis pipeline

**Request Body:**

```json
{
  "paper_title": "Visual Correspondence Hallucination",
  "paper_abstract": "This paper investigates...",
  "reviews": [
    "Review 1: The methodology is sound but...",
    "Review 2: While the experiments are comprehensive..."
  ]
}
```

**Response:**

```json
{
  "request_id": "req_123456789",
  "paper_title": "...",
  "critique_points": [...],
  "disagreements": [...],
  "search_results": {...},
  "resolution": [...],
  "meta_review": "..."
}
```

### Individual Stages

#### Critique Extraction

**Endpoint:** `/api/critique_extraction`  
**Method:** `POST`

```json
{
  "reviews": ["Review 1 text...", "Review 2 text..."]
}
```

#### Disagreement Detection

**Endpoint:** `/api/disagreement_detection`  
**Method:** `POST`

```json
{
  "critiques": [
    {"Methodology": [...], "Experiments": [...]},
    {"Methodology": [...], "Experiments": [...]}
  ]
}
```

#### Search & Retrieval

**Endpoint:** `/api/search_retrieval`  
**Method:** `POST`

```json
{
  "paper_title": "...",
  "paper_abstract": "...",
  "critiques": [...]
}
```

### Progress Tracking

**Endpoint:** `/api/progress/{request_id}`  
**Method:** `GET`

**Response:**

```json
{
  "stage": "search_retrieval",
  "progress": 0.5,
  "message": "Searching for relevant research...",
  "timestamp": "2025-01-15T10:30:00"
}
```
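A client can poll this endpoint until the pipeline finishes. Below is a minimal polling sketch; the fetch function is injected as a callable so any HTTP client can be used, and the completion criterion (`progress` reaching `1.0`) is an assumption about the final payload:

```python
import time


def wait_for_completion(fetch_progress, poll_interval=2.0, max_polls=300):
    """Poll fetch_progress() until the reported progress reaches 1.0.

    fetch_progress: callable returning a dict shaped like the
    /api/progress response above. Returns the final payload seen.
    """
    payload = {}
    for _ in range(max_polls):
        payload = fetch_progress()
        if payload.get("progress", 0.0) >= 1.0:
            return payload
        time.sleep(poll_interval)
    raise TimeoutError(f"pipeline still at stage {payload.get('stage')!r}")
```

To wire it to the API, pass something like `fetch_progress=lambda: requests.get(f"{base_url}/api/progress/{request_id}").json()`.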

## 🔧 Configuration

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `GEMINI_API_KEY` | Google Gemini API key | Required |
| `OPENROUTER_API_KEY` | OpenRouter API key (DeepSeek) | Required |
| `TAVILY_API_KEY` | Tavily Search API key | Required |
| `SERPAPI_API_KEY` | SerpAPI key for Google Scholar | Optional |
| `MAX_REQUESTS_PER_MINUTE` | Rate limit | `10` |
| `MAX_CONCURRENT_TASKS` | Max parallel executions | `3` |
| `MAX_RETRIES` | Retry attempts on failure | `5` |
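For local development, the same variables go in `.env`; a sample with illustrative placeholder values:

```bash
# Required
GEMINI_API_KEY=your-gemini-key
OPENROUTER_API_KEY=your-openrouter-key
TAVILY_API_KEY=your-tavily-key

# Optional
SERPAPI_API_KEY=your-serpapi-key
MAX_REQUESTS_PER_MINUTE=10
MAX_CONCURRENT_TASKS=3
MAX_RETRIES=5
```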

### Rate Limits

- 10 requests per minute per client IP
- Maximum 3 concurrent pipeline executions
- Queue size: 20 pending requests
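Clients that submit many papers should pace themselves under the 10-requests-per-minute cap rather than rely on rejected requests. A minimal client-side limiter sketch (the injectable `clock`/`sleep` parameters are for testability and are not part of this API):

```python
import time
from collections import deque


class ClientRateLimiter:
    """Blocks before a call would exceed a requests-per-minute cap."""

    def __init__(self, max_per_minute=10, clock=time.monotonic, sleep=time.sleep):
        self.max_per_minute = max_per_minute
        self.clock = clock
        self.sleep = sleep
        self.sent = deque()  # timestamps of requests within the last 60 s

    def acquire(self):
        """Wait (if needed) until one more request fits in the window."""
        now = self.clock()
        # Drop timestamps that have fallen out of the 60-second window.
        while self.sent and now - self.sent[0] >= 60.0:
            self.sent.popleft()
        if len(self.sent) >= self.max_per_minute:
            # Sleep until the oldest request ages out of the window.
            self.sleep(60.0 - (now - self.sent[0]))
            now = self.clock()
            while self.sent and now - self.sent[0] >= 60.0:
                self.sent.popleft()
        self.sent.append(now)
```

Call `limiter.acquire()` immediately before each `requests.post(...)` to stay under the cap.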

πŸ—οΈ Architecture

api/
β”œβ”€β”€ app.py                      # Main Gradio application
β”œβ”€β”€ config.py                   # Configuration management
β”œβ”€β”€ requirements.txt            # Python dependencies
β”œβ”€β”€ pipeline/                   # Pipeline modules
β”‚   β”œβ”€β”€ critique_extraction.py  # Gemini-based extraction
β”‚   β”œβ”€β”€ disagreement_detection.py
β”‚   β”œβ”€β”€ search_retrieval.py     # LangChain search agent
β”‚   β”œβ”€β”€ disagreement_resolution.py  # DeepSeek resolution
β”‚   └── meta_review.py
└── utils/                      # Utility modules
    β”œβ”€β”€ rate_limiter.py
    β”œβ”€β”€ queue_manager.py
    └── validators.py

πŸ” Pipeline Stages

  1. Critique Extraction (Gemini 2.0)

    • Extracts structured critique points
    • Categories: Methodology, Experiments, Clarity, Significance, Novelty
  2. Disagreement Detection (Gemini 2.0)

    • Compares all review pairs
    • Assigns disagreement scores (0-1)
    • Identifies specific conflict points
  3. Search & Retrieval (LangChain + Multi-Search)

    • SoTA research discovery
    • Evidence validation
    • Sources: Semantic Scholar, arXiv, Google Scholar, Tavily
  4. Disagreement Resolution (DeepSeek-R1)

    • Validates critique points
    • Accepts/rejects based on evidence
    • Provides resolution summaries
  5. Meta-Review Generation (DeepSeek-R1)

    • Synthesizes all analyses
    • Provides final verdict
    • Offers actionable recommendations
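Callers using the individual stage endpoints instead of `/api/full_pipeline` chain them in the order above, each stage consuming the previous stage's output. A minimal orchestration sketch with each stage injected as a callable; the argument shapes mirror the request bodies shown earlier, but the exact signatures of the resolution and meta-review stages are assumptions:

```python
def run_stages(paper_title, paper_abstract, reviews,
               extract, detect, search, resolve, summarize):
    """Chain the five pipeline stages.

    Each parameter after `reviews` is a callable wrapping one stage
    endpoint (e.g. an HTTP POST to /api/critique_extraction).
    Returns the meta-review produced by the final stage.
    """
    critiques = extract(reviews)                              # stage 1
    disagreements = detect(critiques)                         # stage 2
    evidence = search(paper_title, paper_abstract, critiques)  # stage 3
    resolution = resolve(disagreements, evidence)             # stage 4
    return summarize(critiques, disagreements, resolution)    # stage 5
```

Keeping the stages as plain callables makes the flow easy to unit-test with stubs before pointing it at a live Space.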

## 📊 Example Usage

### Python

```python
import requests

response = requests.post(
    "https://your-space.hf.space/api/full_pipeline",
    json={
        "paper_title": "Novel Approach to X",
        "paper_abstract": "We propose...",
        "reviews": [
            "Reviewer 1: Strong methodology...",
            "Reviewer 2: Weak experimental validation..."
        ]
    }
)

result = response.json()
print(result["meta_review"])
```

### cURL

```bash
curl -X POST https://your-space.hf.space/api/full_pipeline \
  -H "Content-Type: application/json" \
  -d '{
    "paper_title": "Novel Approach to X",
    "paper_abstract": "We propose...",
    "reviews": ["Review 1...", "Review 2..."]
  }'
```

πŸ“ License

See the main project LICENSE file.