metadata
title: HF EDA MCP Server
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.0.0
app_file: src/app.py
pinned: false
license: apache-2.0
app_port: 7860
HF EDA MCP Server
An MCP (Model Context Protocol) server that provides tools for Exploratory Data Analysis (EDA) of HuggingFace datasets.
Features
- Dataset Metadata: Retrieve comprehensive information about HuggingFace datasets including size, features, splits, and configurations
- Dataset Sampling: Get samples from any dataset split for quick exploration
- Feature Analysis: Perform basic EDA with automatic optimization
  - Uses HuggingFace Dataset Viewer API for full dataset statistics (when available)
  - Automatic fallback to sample-based analysis
  - Supports multiple data types: numerical, categorical, text, image, audio
  - Includes histograms, distributions, and missing value analysis
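To make the sample-based path concrete, here is a minimal sketch of the kind of per-column summary it could produce from a handful of rows. The function name `summarize_sample` and the output keys are hypothetical and are not taken from this server's code; only numerical and categorical columns are handled, and anything that is not purely numeric is treated as categorical.

```python
from collections import Counter
from statistics import mean


def summarize_sample(rows: list[dict]) -> dict:
    """Hypothetical helper: per-column missing counts and simple summaries."""
    # Collect every column name that appears in at least one row.
    columns = sorted({key for row in rows for key in row})
    summary = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        info = {"missing": len(values) - len(present)}
        # bool is a subclass of int in Python, so exclude it from "numerical".
        is_numeric = bool(present) and all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
        )
        if is_numeric:
            info.update(kind="numerical", min=min(present), max=max(present), mean=mean(present))
        else:
            # Report the three most frequent values as a tiny distribution.
            info.update(kind="categorical", top=Counter(map(str, present)).most_common(3))
        summary[col] = info
    return summary


sample_rows = [
    {"age": 30, "label": "a"},
    {"age": None, "label": "b"},
    {"age": 50, "label": "a"},
]
print(summarize_sample(sample_rows))
```

A sample-based summary like this only describes the rows that were fetched, which is why full-dataset statistics are preferable when they are available.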
Usage
This Space runs as an MCP server that can be accessed by MCP-compatible AI assistants.
MCP Client Configuration
Add this server to your MCP client configuration:
{
  "mcpServers": {
    "hf-eda-mcp": {
      "url": "https://YOUR-USERNAME-hf-eda-mcp.hf.space/gradio_api/mcp/sse"
    }
  }
}
Replace YOUR-USERNAME with your HuggingFace username.
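If you prefer to generate the configuration rather than edit it by hand, a small script can fill in the username. This is a sketch that simply follows the URL pattern shown above; the helper name `build_mcp_config` is hypothetical, and the Space name `hf-eda-mcp` is assumed from that URL.

```python
import json


def build_mcp_config(username: str, space_name: str = "hf-eda-mcp") -> dict:
    """Hypothetical helper: build the client config for a given Space owner."""
    # Follows the pattern from the README: <username>-<space>.hf.space
    url = f"https://{username}-{space_name}.hf.space/gradio_api/mcp/sse"
    return {"mcpServers": {"hf-eda-mcp": {"url": url}}}


# Print JSON that can be pasted into an MCP client configuration file.
print(json.dumps(build_mcp_config("YOUR-USERNAME"), indent=2))
```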
Available Tools
- get_dataset_metadata: Get detailed information about a dataset including size, features, splits, and download statistics
- get_dataset_sample: Retrieve sample rows from a dataset for quick exploration
- analyze_dataset_features: Perform comprehensive exploratory analysis with automatic optimization
  - Automatically uses Dataset Viewer API statistics for parquet datasets (full dataset analysis)
  - Falls back to sample-based analysis for other formats
  - Returns feature types, statistics, histograms, and missing value analysis
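The fallback behaviour of analyze_dataset_features can be pictured as a try-first, fall-back-second pattern. The sketch below is not the server's actual implementation: `analyze_with_fallback` and both callables are hypothetical stand-ins, passed in as arguments so the control flow can be shown without any network access.

```python
from typing import Callable


def analyze_with_fallback(
    fetch_full_statistics: Callable[[], dict],
    analyze_sample: Callable[[], dict],
) -> dict:
    """Hypothetical sketch: prefer full-dataset statistics, else use a sample."""
    try:
        # Assumed to raise when precomputed statistics are unavailable,
        # for example when the dataset has no parquet export.
        stats = fetch_full_statistics()
        return {"source": "dataset_viewer", "statistics": stats}
    except Exception as exc:  # broad on purpose in this sketch
        return {
            "source": "sample",
            "statistics": analyze_sample(),
            "fallback_reason": str(exc),
        }


def stats_unavailable() -> dict:
    raise RuntimeError("statistics not available for this dataset")


result = analyze_with_fallback(stats_unavailable, lambda: {"rows_analyzed": 100})
print(result["source"], result["fallback_reason"])
```

Recording which path produced the result lets the caller tell whether the statistics cover the full dataset or only a sample.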
Authentication
For private datasets, set the HF_TOKEN secret in your Space settings.
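Space secrets are exposed to the running app as environment variables, so the token can be read with `os.environ`. The sketch below shows one way code might turn it into a request header; the helper name `auth_headers` is hypothetical, and how this server actually passes the token along is not shown here.

```python
import os
from typing import Mapping


def auth_headers(env: Mapping[str, str] = os.environ) -> dict:
    """Hypothetical helper: build a bearer header if HF_TOKEN is set."""
    token = env.get("HF_TOKEN")
    # Without a token, return no headers so public datasets still work.
    return {"Authorization": f"Bearer {token}"} if token else {}


# The token value itself is never printed, only whether one was found.
print("token configured:", bool(auth_headers()))
```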
License
Apache License 2.0