|
|
--- |
|
|
title: Wikipedia Entity Extractor — MCP Server |
|
|
emoji: 👀 |
|
|
colorFrom: pink |
|
|
colorTo: red |
|
|
sdk: gradio |
|
|
sdk_version: 5.33.0 |
|
|
app_file: app.py |
|
|
pinned: false |
|
|
license: mit |
|
|
short_description: entity-recognizer |
|
|
tags: ["mcp-server-track"] |
|
|
video_overview: https://youtu.be/KWk1P7Br4uk |
|
|
--- |
|
|
|
|
|
|
|
|
Video link : https://youtu.be/KWk1P7Br4uk |
|
|
|
|
|
|
|
|
|
|
|
# 🧠 Wikipedia Entity Extractor — MCP Server |
|
|
|
|
|
Welcome to the **Wikipedia Entity Extractor**, a simple yet powerful MCP-compatible tool built with 🤗 Gradio and 🔗 Hugging Face Inference API. This project was created as part of the Gradio Hackathon to showcase how **modular AI components** can be wired together using the **Model Context Protocol (MCP)**. |
|
|
|
|
|
## ✨ What It Does |
|
|
|
|
|
- Takes **freeform user text** as input. |
|
|
- Uses a **Hugging Face-hosted LLM** (like Qwen or Mixtral) to extract **named entities** likely to have a Wikipedia page. |
|
|
- Searches **Wikipedia** for those entities. |
|
|
- Returns a clean, structured **JSON dictionary** mapping each entity to the first paragraph of its Wikipedia article. |
|
|
|
|
|
> “Who is Alan Turing and what is Bletchley Park?” → 🧠 → 📚 → ✅ JSON with article intros |
|
|
|
|
|
## 🚀 Tech Stack |
|
|
|
|
|
- `Gradio` — UI and MCP server integration |
|
|
- `huggingface_hub.InferenceClient` — for efficient, plug-and-play LLM calls |
|
|
- `requests` — to fetch article summaries from Wikipedia’s REST API |
|
|
- `MCP` — to allow this component to be chained with others seamlessly |
|
|
|
|
|
## 🔐 Secure Token Handling |
|
|
|
|
|
This app uses a **bring-your-own-token** approach: |
|
|
- The Hugging Face token is passed as a user input (masked password field). |
|
|
- No secrets are exposed in the public Space. |
|
|
- You can also set `HF_TOKEN` via environment variables for internal runs. |
|
|
|
|
|
## 🧰 Example Input |
|
|
|
|
|
```text |
|
|
Barack Obama was born in Hawaii and studied at Harvard Law School. |
|
|
``` |
|
|
## 📦 Example Output |
|
|
|
|
|
```json |
|
|
{ |
|
|
"Barack Obama": |
|
|
"Barack Hussein Obama II is an American politician who was the 44th president of the United States from 2009 to 2017. A member of the Democratic Party, he was the first African American president in American history. Obama previously served as a U.S. senator representing Illinois from 2005 to 2008 and as an Illinois state senator from 1997 to 2004.", |
|
|
"Hawaii": |
|
|
"Hawaii is an island state of the United States, in the Pacific Ocean about 2,000 miles (3,200 km) southwest of the U.S. mainland. One of the two non-contiguous U.S. states, it is the only state not on the North American mainland, the only state that is an archipelago, and the only state in the tropics.", |
|
|
"Harvard Law School": |
|
|
"Harvard Law School (HLS) is the law school of Harvard University, a private research university in Cambridge, Massachusetts. Founded in 1817, Harvard Law School is the oldest law school in continuous operation in the United States." |
|
|
} |
|
|
``` |
|
|
|
|
|
## 🛠 How to Run |
|
|
|
|
|
You can run this: |
|
|
|
|
|
set HF_TOKEN in your env |
|
|
Locally with python app.py |
|
|
or |
|
|
In a Hugging Face Space |
|
|
|
|
|
|
|
|
## 🤝 Designed For |
|
|
|
|
|
AI agents that need grounding via structured world knowledge |
|
|
|
|
|
Tool-augmented chatbots |
|
|
|
|
|
Zero-shot extract-and-retrieve pipelines |
|
|
|
|
|
Hackathon creativity 🤖🧩🚀 |