title: HF EDA MCP Server
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.0.0
app_file: src/app.py
pinned: false
license: apache-2.0
app_port: 7860

HF EDA MCP Server

An MCP (Model Context Protocol) server that provides tools for Exploratory Data Analysis (EDA) of HuggingFace datasets.

Features

  • Dataset Metadata: Retrieve comprehensive information about HuggingFace datasets including size, features, splits, and configurations
  • Dataset Sampling: Get samples from any dataset split for quick exploration
  • Feature Analysis: Run basic EDA, automatically choosing the most complete data source available
    • Uses the HuggingFace Dataset Viewer API for full-dataset statistics (when available)
    • Falls back to sample-based analysis when Dataset Viewer statistics are unavailable
    • Supports multiple data types: numerical, categorical, text, image, audio
    • Includes histograms, distributions, and missing value analysis
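
To make the sample-based path concrete, here is a minimal sketch of the kind of per-column summary such an analysis might produce: a missing-value count plus either a histogram (numerical columns) or the most frequent values (everything else). The helper name `summarize_column`, the bin count, and the output keys are assumptions made for illustration; they are not taken from this server's source.

```python
from collections import Counter
from typing import Any, Optional, Sequence


def summarize_column(values: Sequence[Optional[Any]], bins: int = 4) -> dict:
    """Summarize one column of sampled rows (hypothetical helper)."""
    present = [v for v in values if v is not None]
    summary = {"count": len(values), "missing": len(values) - len(present)}

    # Treat the column as numerical only if every non-missing value is a number.
    is_numeric = bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if is_numeric:
        lo, hi = min(present), max(present)
        # Guard against a zero-width range when all values are equal.
        width = (hi - lo) / bins or 1.0
        histogram = [0] * bins
        for v in present:
            # Clamp so the maximum value lands in the last bin.
            histogram[min(int((v - lo) / width), bins - 1)] += 1
        summary.update(
            type="numerical",
            min=lo,
            max=hi,
            mean=sum(present) / len(present),
            histogram=histogram,
        )
    else:
        top = Counter(str(v) for v in present).most_common(3)
        summary.update(type="categorical", top_values=top)
    return summary
```

Calling `summarize_column` once per feature over the rows of a sample yields a small report that can be returned even when no precomputed statistics exist for the dataset.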

Usage

This Space runs as an MCP server that MCP-compatible AI assistants can connect to.

MCP Client Configuration

Add this server to your MCP client configuration:

{
  "mcpServers": {
    "hf-eda-mcp": {
      "url": "https://YOUR-USERNAME-hf-eda-mcp.hf.space/gradio_api/mcp/sse"
    }
  }
}

Replace YOUR-USERNAME with your HuggingFace username.
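
If you prefer to generate the configuration rather than edit it by hand, the following sketch builds the same structure from a username. The function name `build_mcp_config` is hypothetical, and the URL pattern is simply the one shown above; Spaces may normalize usernames that contain uppercase letters or special characters, so check the resulting URL against your Space before relying on it.

```python
import json


def build_mcp_config(username: str, space_name: str = "hf-eda-mcp") -> dict:
    """Return an MCP client configuration for a duplicated Space (hypothetical helper)."""
    # URL pattern copied from the configuration example in this README.
    url = f"https://{username}-{space_name}.hf.space/gradio_api/mcp/sse"
    return {"mcpServers": {space_name: {"url": url}}}


if __name__ == "__main__":
    # Print JSON that can be pasted into an MCP client configuration file.
    print(json.dumps(build_mcp_config("YOUR-USERNAME"), indent=2))
```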

Available Tools

  1. get_dataset_metadata: Get detailed information about a dataset including size, features, splits, and download statistics
  2. get_dataset_sample: Retrieve sample rows from a dataset for quick exploration
  3. analyze_dataset_features: Run exploratory analysis of a dataset's features, automatically choosing the most complete data source available
    • Automatically uses Dataset Viewer API statistics for parquet datasets (full dataset analysis)
    • Falls back to sample-based analysis for other formats
    • Returns feature types, statistics, histograms, and missing value analysis
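
The fallback behavior of analyze_dataset_features can be sketched roughly as follows. The Dataset Viewer API serves precomputed statistics from a `/statistics` endpoint that takes `dataset`, `config`, and `split` query parameters; everything else here (the function names, the injected callables, and the shape of the returned dictionary) is an assumption made for illustration, not the server's actual implementation.

```python
from typing import Any, Callable
from urllib.parse import urlencode

VIEWER_BASE = "https://datasets-server.huggingface.co"


def statistics_url(dataset: str, config: str, split: str) -> str:
    """Build the Dataset Viewer statistics URL for one split."""
    query = urlencode({"dataset": dataset, "config": config, "split": split})
    return f"{VIEWER_BASE}/statistics?{query}"


def analyze_with_fallback(
    fetch_full_stats: Callable[[], Any],
    analyze_sample: Callable[[], Any],
) -> dict:
    """Prefer full-dataset statistics; fall back to a sample on any failure.

    Both callables are hypothetical stand-ins: the first would request
    statistics_url(...) over HTTP, the second would analyze sampled rows.
    """
    try:
        return {"source": "dataset_viewer", "statistics": fetch_full_stats()}
    except Exception as exc:  # e.g. statistics not available for this dataset
        return {
            "source": "sample",
            "statistics": analyze_sample(),
            "fallback_reason": str(exc),
        }
```

Passing the two strategies in as callables keeps the decision logic testable without network access, and recording a `source` field lets the caller tell full-dataset results apart from sample-based estimates.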

Authentication

For private datasets, set the HF_TOKEN secret in your Space settings.
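
Space secrets are exposed to the running app as environment variables, so code inside the Space can read the token from `HF_TOKEN`. As a minimal sketch, assuming the server sends the token to the Hugging Face APIs as a standard bearer token (the helper name `auth_headers` is hypothetical):

```python
import os
from typing import Mapping, Optional


def auth_headers(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return HTTP headers carrying HF_TOKEN, or no headers if it is unset."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN")
    # Without a token, requests stay anonymous and only public datasets work.
    return {"Authorization": f"Bearer {token}"} if token else {}
```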

License

Apache License 2.0