Commit f81729c by fuutott (verified) · 1 Parent(s): 1a50710

Update README.md

  1. README.md +71 -6
README.md CHANGED
@@ -1,14 +1,79 @@
  ---
- title: Wikipedia Entity Extractor MCP Server
- emoji: 🌖
- colorFrom: green
- colorTo: pink
  sdk: gradio
  sdk_version: 5.33.0
  app_file: app.py
  pinned: false
  license: mit
- short_description: 🧠 Wikipedia Entity Extractor — MCP Server
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
+ title: Wikipedia Entity Extractor MCP Server
+ emoji: 👀
+ colorFrom: pink
+ colorTo: red
  sdk: gradio
  sdk_version: 5.33.0
  app_file: app.py
  pinned: false
  license: mit
+ short_description: entity-recognizer
+ tags:
+ - mcp-server-track
  ---

+ # 🧠 Wikipedia Entity Extractor MCP Server
+
+ Welcome to the **Wikipedia Entity Extractor**, a small MCP-compatible tool built with 🤗 Gradio and the 🔗 Hugging Face Inference API. This project was created for the Gradio Hackathon to show how **modular AI components** can be wired together using the **Model Context Protocol (MCP)**.
+
+ ## ✨ What It Does
+
+ - Takes **freeform user text** as input.
+ - Uses a **Hugging Face-hosted LLM** (such as Qwen or Mixtral) to extract **named entities** likely to have a Wikipedia page.
+ - Searches **Wikipedia** for those entities.
+ - Returns a structured **JSON dictionary** mapping each entity to the first paragraph of its Wikipedia article.
+
+ > “Who is Alan Turing and what is Bletchley Park?” → 🧠 → 📚 → ✅ JSON with article intros
+
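
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the helper names `summary_url`, `fetch_summary`, and `build_entity_map` are hypothetical, the LLM step is assumed to have already produced a list of entity names, and the standard-library `urllib` stands in for `requests` to keep the sketch dependency-free. It assumes the Wikipedia REST `page/summary` endpoint, whose `extract` field holds the article intro.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterable, Optional

# Assumed endpoint: Wikipedia's REST "page summary" route.
WIKI_SUMMARY_URL = "https://en.wikipedia.org/api/rest_v1/page/summary/{title}"


def summary_url(entity: str) -> str:
    """Build the summary URL for an entity name (spaces become underscores)."""
    title = urllib.parse.quote(entity.strip().replace(" ", "_"), safe="")
    return WIKI_SUMMARY_URL.format(title=title)


def fetch_summary(entity: str, timeout: float = 10.0) -> Optional[str]:
    """Return the article intro for an entity, or None if the lookup fails."""
    request = urllib.request.Request(
        summary_url(entity),
        headers={"User-Agent": "entity-extractor-example/0.1"},  # hypothetical UA
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            payload = json.load(response)
    except (OSError, ValueError):
        # Network errors, HTTP errors (e.g. 404), and malformed JSON all mean "no result".
        return None
    return payload.get("extract")


def build_entity_map(
    entities: Iterable[str],
    fetch: Callable[[str], Optional[str]] = fetch_summary,
) -> Dict[str, str]:
    """Map each distinct entity to its intro, skipping entities with no article."""
    result: Dict[str, str] = {}
    for entity in entities:
        name = entity.strip()
        if not name or name in result:
            continue
        intro = fetch(name)
        if intro:
            result[name] = intro
    return result
```

Passing `fetch` as a parameter keeps the lookup easy to stub out in tests, for example `build_entity_map(["Alan Turing"], fetch={"Alan Turing": "intro"}.get)`.
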
+ ## 🚀 Tech Stack
+
+ - `Gradio`: UI and MCP server integration
+ - `huggingface_hub.InferenceClient`: LLM calls through the Hugging Face Inference API
+ - `requests`: fetches article summaries from Wikipedia’s REST API
+ - `MCP`: lets this component be chained with other MCP tools
+
+ ## 🔐 Secure Token Handling
+
+ This app uses a **bring-your-own-token** approach:
+ - The Hugging Face token is passed as a user input (masked password field).
+ - No secrets are exposed in the public Space.
+ - You can also set `HF_TOKEN` as an environment variable for internal runs.
+
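
One way to combine the two token sources is sketched below. The function name `resolve_hf_token` and the precedence (UI input first, `HF_TOKEN` second) are assumptions for illustration; the README does not say which source wins.

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(
    user_token: Optional[str],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the token typed into the UI, falling back to the HF_TOKEN variable."""
    environment = os.environ if env is None else env
    # Assumed precedence: a non-empty UI value overrides the environment.
    token = (user_token or "").strip() or environment.get("HF_TOKEN", "").strip()
    if not token:
        raise ValueError("No Hugging Face token: enter one in the UI or set HF_TOKEN.")
    return token
```

The returned string could then be passed to `huggingface_hub.InferenceClient(token=...)`, so the token never has to be stored in the Space itself.
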
+ ## 🧰 Example Input
+
+ ```text
+ Barack Obama was born in Hawaii and studied at Harvard Law School.
+ ```
+ ## 📦 Example Output
+
+ ```json
+ {
+   "Barack Obama": "Barack Hussein Obama II is an American politician who was the 44th president of the United States from 2009 to 2017. A member of the Democratic Party, he was the first African American president in American history. Obama previously served as a U.S. senator representing Illinois from 2005 to 2008 and as an Illinois state senator from 1997 to 2004.",
+   "Hawaii": "Hawaii is an island state of the United States, in the Pacific Ocean about 2,000 miles (3,200 km) southwest of the U.S. mainland. One of the two non-contiguous U.S. states, it is the only state not on the North American mainland, the only state that is an archipelago, and the only state in the tropics.",
+   "Harvard Law School": "Harvard Law School (HLS) is the law school of Harvard University, a private research university in Cambridge, Massachusetts. Founded in 1817, Harvard Law School is the oldest law school in continuous operation in the United States."
+ }
+ ```
+
+ ## 🛠 How to Run
+
+ Set `HF_TOKEN` in your environment (or enter the token in the UI), then run the app either:
+
+ - Locally, with `python app.py`
+ - In a Hugging Face Space
+
+ ## 🤝 Designed For
+
+ - AI agents that need grounding via structured world knowledge
+ - Tool-augmented chatbots
+ - Zero-shot extract-and-retrieve pipelines
+ - Hackathon creativity 🤖🧩🚀